WO 02/30268 PCT/US01/32045



METHODS OF DIAGNOSIS OF PROSTATE CANCER, 
COMPOSITIONS AND METHODS OF SCREENING FOR 
MODULATORS OF PROSTATE CANCER 


CROSS-REFERENCES TO RELATED APPLICATIONS 

This application claims priority from the following applications: USSN
09/687,576, filed October 13, 2000; USSN 60/276,791, filed March 16, 2001; USSN
60/288,589, filed May 4, 2001; USSN 09/733,742, filed December 8, 2000; USSN
09/733,288, filed December 8, 2000; USSN 09/847,046, filed April 30, 2001; USSN
60/276,888, filed March 16, 2001; USSN 60/286,214, filed April 24, 2001; USSN
60/281,922, filed April 6, 2001; and USSN 60/263,957, filed January 24, 2001, all of which are
incorporated herein by reference in their entirety.

FIELD OF THE INVENTION

The invention relates to the identification of nucleic acid and protein 
expression profiles and nucleic acids, products, and antibodies thereto that are involved in 
prostate cancer; and to the use of such expression profiles and compositions in the diagnosis, 
prognosis and therapy of prostate cancer. The invention further relates to methods for 
identifying and using agents and/or targets that inhibit prostate cancer.

BACKGROUND OF THE INVENTION 
Prostate cancer is the most commonly diagnosed internal malignancy and the
second most common cause of cancer death in men in the U.S., resulting in approximately
40,000 deaths each year (Landis et al., CA Cancer J. Clin. 48:6-29 (1998); Greenlee et al.,
CA Cancer J. Clin. 50(1):7-13 (2000)), and the incidence of prostate cancer has been increasing
rapidly over the past 20 years in many parts of the world (Nakata et al., Int. J. Urol.
7(7):254-257 (2000); Majeed et al., BJU Int. 85(9):1058-1062 (2000)). It develops as the
result of a pathologic transformation of normal prostate cells. In tumorigenesis, the cancer
cell undergoes initiation, proliferation and loss of contact inhibition, culminating in invasion 
of surrounding tissue and, ultimately, metastasis. 

Deaths from prostate cancer are a result of metastasis of a prostate tumor.
Therefore, early detection of the development of prostate cancer is critical in reducing
mortality from this disease. Measuring levels of prostate-specific antigen (PSA) has become
a very common method for early detection and screening, and may have contributed to the
slight decrease in the mortality rate from prostate cancer in recent years (Nowroozi et al.,
Cancer Control 5(6):522-531 (1998)). However, many cases are not diagnosed until the
disease has progressed to an advanced stage.

Treatments such as surgery (prostatectomy), radiation therapy, and
cryotherapy are potentially curative when the cancer remains localized to the prostate.
Therefore, early detection of prostate cancer is important for a positive prognosis for
treatment. Systemic treatment for metastatic prostate cancer is limited to hormone therapy
and chemotherapy. Chemical or surgical castration has been the primary treatment for
symptomatic metastatic prostate cancer for over 50 years. This testicular androgen
deprivation therapy usually results in stabilization or regression of the disease (in 80% of
patients), but progression of metastatic prostate cancer eventually develops (Panvichian et al.,
Cancer Control 3(6):493-500 (1996)). Metastatic disease is currently considered incurable,
and the primary goals of treatment are to prolong survival and improve quality of life (Rago,
Cancer Control 5(6):513-521 (1998)).

Thus, methods that can be used for diagnosis and prognosis of prostate cancer
and for effective treatment of prostate cancer, particularly metastatic prostate
cancer, would be desirable. Accordingly, provided herein are methods that can be used in
diagnosis and prognosis of prostate cancer. Further provided are methods that can be used to
screen candidate bioactive agents for the ability to modulate, e.g., treat, prostate cancer.
Additionally, provided herein are molecular targets and compositions for therapeutic
intervention in prostate cancer and other cancers.

SUMMARY OF THE INVENTION
The present invention therefore provides nucleotide sequences of genes that 
are up- and down-regulated in prostate cancer cells. Such genes are useful for diagnostic 
purposes, and also as targets for screening for therapeutic compounds that modulate prostate 
cancer, such as hormones or antibodies. Other aspects of the invention will become apparent
to the skilled artisan by the following description of the invention. 

In one aspect, the present invention provides a method of detecting a prostate 
cancer-associated transcript in a cell from a patient, the method comprising contacting a 
biological sample from the patient with a polynucleotide that selectively hybridizes to a 
sequence at least 80% identical to a sequence as shown in Tables 1-16.

In one embodiment, the present invention provides a method of determining
the level of a prostate cancer-associated transcript in a cell from a patient.

In one embodiment, the present invention provides a method of detecting a
prostate cancer-associated transcript in a cell from a patient, the method comprising
contacting a biological sample from the patient with a polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16.

In one embodiment, the polynucleotide selectively hybridizes to a sequence at
least 95% identical to a sequence as shown in Tables 1-16. In another embodiment, the
polynucleotide comprises a sequence as shown in Tables 1-16.

In one embodiment, the biological sample is a tissue sample. In another
embodiment, the biological sample comprises isolated nucleic acids, e.g., mRNA.

In one embodiment, the polynucleotide is labeled, e.g., with a fluorescent
label.

In one embodiment, the polynucleotide is immobilized on a solid surface.

In one embodiment, the patient is undergoing a therapeutic regimen to treat
prostate cancer. In another embodiment, the patient is suspected of having metastatic
prostate cancer.

In one embodiment, the patient is a human.

In one embodiment, the patient is suspected of having a taxol-resistant cancer.

In one embodiment, the prostate cancer-associated transcript is mRNA.

In one embodiment, the method further comprises the step of amplifying
nucleic acids before the step of contacting the biological sample with the polynucleotide.

In another aspect, the present invention provides a method of monitoring the
efficacy of a therapeutic treatment of prostate cancer, the method comprising the steps of: (i)
providing a biological sample from a patient undergoing the therapeutic treatment; and (ii)
determining the level of a prostate cancer-associated transcript in the biological sample by
contacting the biological sample with a polynucleotide that selectively hybridizes to a
sequence at least 80% identical to a sequence as shown in Tables 1-16, thereby monitoring
the efficacy of the therapy. In a further embodiment, the patient has metastatic prostate
cancer. In a further embodiment, the patient has a drug-resistant (e.g., taxol-resistant) form of
prostate cancer.

In one embodiment, the method further comprises the step of: (iii) comparing
the level of the prostate cancer-associated transcript to a level of the prostate
cancer-associated transcript in a biological sample from the patient prior to, or earlier in, the
therapeutic treatment.

Additionally, provided herein is a method of evaluating the effect of a
candidate prostate cancer drug comprising administering the drug to a patient and removing a
cell sample from the patient. The expression profile of the cell is then determined. This
method may further comprise comparing the expression profile to an expression profile of a
healthy individual. In a preferred embodiment, said expression profile includes a gene of
Tables 1-16.

In one aspect, the present invention provides an isolated nucleic acid molecule
consisting of a polynucleotide sequence as shown in Tables 1-16.

In one embodiment, an expression vector or cell comprises the isolated nucleic
acid.

In one aspect, the present invention provides an isolated polypeptide which is
encoded by a nucleic acid molecule having a polynucleotide sequence as shown in Tables 1-16.

In another aspect, the present invention provides an antibody that specifically
binds to an isolated polypeptide which is encoded by a nucleic acid molecule having a
polynucleotide sequence as shown in Tables 1-16.

In one embodiment, the antibody is conjugated to an effector component, e.g.,
a fluorescent label, a radioisotope or a cytotoxic chemical.

In one embodiment, the antibody is an antibody fragment. In another
embodiment, the antibody is humanized.

In one aspect, the present invention provides a method of detecting a prostate
cancer cell in a biological sample from a patient, the method comprising contacting the
biological sample with an antibody as described herein.

In another aspect, the present invention provides a method of detecting
antibodies specific to prostate cancer in a patient, the method comprising contacting a
biological sample from the patient with a polypeptide encoded by a nucleic acid comprising a
sequence from Tables 1-16.

In another aspect, the present invention provides a method for identifying a
compound that modulates a prostate cancer-associated polypeptide, the method comprising
the steps of: (i) contacting the compound with a prostate cancer-associated polypeptide, the
polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence at least
80% identical to a sequence as shown in Tables 1-16; and (ii) determining the functional
effect of the compound upon the polypeptide.

In one embodiment, the functional effect is a physical effect, an enzymatic
effect, or a chemical effect.

In one embodiment, the polypeptide is expressed in a eukaryotic host cell or
cell membrane. In another embodiment, the polypeptide is recombinant.

In one embodiment, the functional effect is determined by measuring ligand
binding to the polypeptide.

In another aspect, the present invention provides a method of inhibiting
proliferation of a prostate cancer-associated cell to treat prostate cancer in a patient, the
method comprising the step of administering to the patient a therapeutically effective amount
of a compound identified as described herein.

In one embodiment, the compound is an antibody.

In another aspect, the present invention provides a drug screening assay
comprising the steps of: (i) administering a test compound to a mammal having prostate
cancer or to a cell sample isolated therefrom; (ii) comparing the level of gene expression of a
polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence
as shown in Tables 1-16 in a treated cell or mammal with the level of gene expression of the
polynucleotide in a control cell sample or mammal, wherein a test compound that modulates
the level of expression of the polynucleotide is a candidate for the treatment of prostate
cancer.

In one embodiment, the control is a mammal with prostate cancer or a cell 
sample therefrom that has not been treated with the test compound. In another embodiment, 
the control is a normal cell or mammal. 
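
By way of illustration only, the comparison of step (ii) can be sketched as a fold-change test between the treated and control expression levels. The function name, the two-fold threshold, and the numeric values below are assumptions made for this sketch; no particular threshold or measurement scale is specified herein.

```python
import math


def is_candidate_modulator(treated_level, control_level, min_fold_change=2.0):
    """Return True when expression differs between treated and control
    samples by at least min_fold_change, in either direction.

    Both levels are assumed to be positive, normalized expression values.
    The default threshold is a hypothetical choice for illustration.
    """
    if treated_level <= 0 or control_level <= 0:
        raise ValueError("expression levels must be positive")
    # A log2 ratio treats up-regulation and down-regulation symmetrically.
    log2_ratio = math.log2(treated_level / control_level)
    return abs(log2_ratio) >= math.log2(min_fold_change)
```

Under these assumptions, a four-fold increase or a four-fold decrease relative to the control would both flag the test compound as a candidate, while a 20% change would not.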

In one embodiment, the test compound is administered in varying amounts or
concentrations. In another embodiment, the test compound is administered for varying time
periods. In another embodiment, the comparison can occur after addition or removal of the
drug candidate.

In one embodiment, the levels of a plurality of polynucleotides that selectively
hybridize to a sequence at least 80% identical to a sequence as shown in Tables 1-16 are
individually compared to their respective levels in a control cell sample or mammal. In a
preferred embodiment, the plurality of polynucleotides is from three to ten.

In another aspect, the present invention provides a method for treating a
mammal having prostate cancer comprising administering a compound identified by the
assay described herein.

In another aspect, the present invention provides a pharmaceutical
composition for treating a mammal having prostate cancer, the composition comprising a
compound identified by the assay described herein and a physiologically acceptable
excipient.

In one aspect, the present invention provides a method of screening drug
candidates by providing a cell expressing a gene that is up- or down-regulated as in a
prostate cancer. In one embodiment, the gene is selected from Tables 1-16. The method
further includes adding a drug candidate to the cell and determining the effect of the drug
candidate on the expression of the expression profile gene.

In one embodiment, the method of screening drug candidates includes
comparing the level of expression in the absence of the drug candidate to the level of
expression in the presence of the drug candidate, wherein the concentration of the drug
candidate can vary when present, and wherein the comparison can occur after addition or
removal of the drug candidate. In a preferred embodiment, the cell expresses at least two
expression profile genes. The profile genes may show an increase or decrease.

Also provided is a method of evaluating the effect of a candidate prostate
cancer drug comprising administering the drug to a transgenic animal expressing or
over-expressing the prostate cancer modulatory protein, or an animal lacking the prostate
cancer modulatory protein, for example as a result of a gene knockout.

Moreover, provided herein is a biochip comprising one or more nucleic acid
segments of Tables 1-16, wherein the biochip comprises fewer than 1000 nucleic acid probes.
Preferably, at least two nucleic acid segments are included. More preferably, at least three
nucleic acid segments are included.

Furthermore, a method of diagnosing a disorder associated with prostate
cancer is provided. The method comprises determining the expression of a gene of Tables
1-16 in a first tissue type of a first individual, and comparing that expression to the expression
of the gene from a second normal tissue type from the first individual or a second unaffected
individual. A difference in the expression indicates that the first individual has a disorder
associated with prostate cancer.

In a further embodiment, the biochip also includes a polynucleotide sequence
of a gene that is not up- or down-regulated in prostate cancer.

In one embodiment, a method is provided for screening for a bioactive agent capable of
interfering with the binding of a prostate cancer modulating protein (prostate cancer
modulatory protein) or a fragment thereof and an antibody which binds to said prostate
cancer modulatory protein or fragment thereof. In a preferred embodiment, the method
comprises combining a prostate cancer modulatory protein or fragment thereof, a candidate
bioactive agent and an antibody which binds to said prostate cancer modulatory protein or
fragment thereof. The method further includes determining the binding of said prostate
cancer modulatory protein or fragment thereof and said antibody. Where there is a change
in binding, an agent is identified as an interfering agent. The interfering agent can be an
agonist or an antagonist. Preferably, the agent inhibits prostate cancer.

Also provided herein are methods of eliciting an immune response in an
individual. In one embodiment, a method provided herein comprises administering to an
individual a composition comprising a prostate cancer modulating protein, or a fragment
thereof. In another embodiment, the protein is encoded by a nucleic acid selected from those
of Tables 1-16.

Further provided herein are compositions capable of eliciting an immune
response in an individual. In one embodiment, a composition provided herein comprises a
prostate cancer modulating protein, preferably encoded by a nucleic acid of Tables 1-16, or a
fragment thereof, and a pharmaceutically acceptable carrier. In another embodiment, said
composition comprises a nucleic acid comprising a sequence encoding a prostate cancer
modulating protein, preferably selected from the nucleic acids of Tables 1-16, and a
pharmaceutically acceptable carrier.

Also provided are methods of neutralizing the effect of a prostate cancer
protein, or a fragment thereof, comprising contacting an agent specific for said protein with
said protein in an amount sufficient to effect neutralization. In another embodiment, the
protein is encoded by a nucleic acid selected from those of Tables 1-16.

In another aspect of the invention, a method of treating an individual for
prostate cancer is provided. In one embodiment, the method comprises administering to said
individual an inhibitor of a prostate cancer modulating protein. In another embodiment, the
method comprises administering to a patient having prostate cancer an antibody to a prostate
cancer modulating protein conjugated to a therapeutic moiety. Such a therapeutic moiety can
be a cytotoxic agent or a radioisotope.

DETAILED DESCRIPTION OF THE INVENTION 
In accordance with the objects outlined above, the present invention provides 
novel methods for diagnosis and prognosis evaluation for prostate cancer (PC), including 
metastatic prostate cancer, as well as methods for screening for compositions which modulate
prostate cancer. Also provided are methods for treating prostate cancer. 

In addition to the other nucleic acid and peptide sequences, the present
invention also relates to the identification of PAA2 as a gene that is highly overexpressed in
prostate cancer patient tissues. The PAA2 sequence is identical to that of the zinc transporter ZNT4.
Results presented herein demonstrate that PAA2/ZNT4 is highly expressed in prostate cancer
cells. The prostate gland is unique in that it has the highest capacity of any organ in the body
to accumulate zinc. Zinc uptake is regulated by prolactin and testosterone, which induce the
expression of a member of the ZIP family of zinc transporters (Costello et al., 1999, J. Biol.
Chem. 274:17499-17504). Zinc accumulation in the prostate functions to inhibit citrate
oxidation, which results in a decrease in cellular ATP production (Costello and Franklin,
1998, Prostate 35:285-296). Cancer cells are more sensitive to decreased ATP production
and have evolved to prevent zinc accumulation. Without wishing to be bound by theory, the
up-regulation of ZNT4 in prostate cancer cells may result in protection of the cells from high
zinc levels by its ability to pump accumulated zinc out of the cells.

The present invention also relates to nucleic acid sequences encoding PBH1.
PBH1 is related to human TRPC7 (transient receptor potential-related channels, NP_003298),
a putative calcium channel highly expressed in brain (Nagamine et al., Genomics 54:124-131
(1998)). Trp is related to melastatin, a gene down-regulated in metastatic melanomas
(Duncan et al., Cancer Res. 58:1515-1520 (1998)), and MTR1, a gene localized to within the
Beckwith-Wiedemann syndrome/Wilms' tumor susceptibility region (Prawitt et al., Hum.
Mol. Genet. 9:203-216 (2000)). Without wishing to be bound by theory, it is believed that
PBH1 functions as a calcium channel.

As a calcium channel, PBH1 is an ideal target for a small molecule
therapeutic, or a therapeutic antibody that disrupts channel function. CD20, the target of
Rituximab in non-Hodgkin's lymphoma (Maloney et al., Blood 90:2188-2195 (1997); Leget
and Czuczman, Curr. Opin. Oncol. 10:548-551 (1998)), is a plasma membrane calcium
channel expressed in B cells (Tedder and Engel, Immunol. Today 15:450-454 (1994)).
Similarly, a small molecule or antibody that inhibits or alters a calcium signal mediated by
PBH1 will result in the death of prostate cancer cells.

PBH1, and other genes of the invention, are also useful as targets for
cytotoxic T-lymphocytes. Genes that are tumor specific, or that are expressed in
immune-privileged organs, are currently being used as potential vaccine targets (Van den Eynde and
Boon, Int. J. Clin. Lab. Res. 27:81-86 (1997)). The expression pattern of PBH1 indicates that
it is an ideal target for cytotoxic T-lymphocytes. Thus, therapies that utilize PBH1-specific
cytotoxic T-lymphocytes to induce prostate cancer cell death are also provided by this
invention. See, e.g., U.S. Patent No. 6,051,227 and WO 00/32231, the disclosures of which
are herein incorporated by reference.

The present invention is also related to the identification of PAA3 as a gene
that is important in the modulation of prostate cancer and/or breast cancer.

Tables 1-16 provide unigene cluster identification numbers, exemplar
accession numbers, or genomic nucleotide position numbers for the nucleotide sequence of
genes that exhibit increased or decreased expression in prostate cancer samples.

Definitions 

The term "prostate cancer protein" or "prostate cancer polynucleotide" or
"prostate cancer-associated transcript" refers to nucleic acid and polypeptide polymorphic
variants, alleles, mutants, and interspecies homologs that: (1) have a nucleotide sequence
that has greater than about 60% nucleotide sequence identity, 65%, 70%, 75%, 80%, 85%,
90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or greater nucleotide
sequence identity, preferably over a region of at least about 25, 50, 100, 200,
500, 1000, or more nucleotides, to a nucleotide sequence of or associated with a unigene
cluster of Tables 1-16; (2) bind to antibodies, e.g., polyclonal antibodies, raised against an
immunogen comprising an amino acid sequence encoded by a nucleotide sequence of or
associated with a unigene cluster of Tables 1-16, and conservatively modified variants
thereof; (3) specifically hybridize under stringent hybridization conditions to a nucleic acid
sequence, or the complement thereof, of Tables 1-16 and conservatively modified variants
thereof; or (4) have an amino acid sequence that has greater than about 60% amino acid
sequence identity, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%,
96%, 97%, 98% or 99% or greater amino acid sequence identity, preferably over
a region of at least about 25, 50, 100, 200, 500, 1000, or more amino acids, to an amino acid
sequence encoded by a nucleotide sequence of or associated with a unigene cluster of Tables
1-16. A polynucleotide or polypeptide sequence is typically from a mammal including, but
not limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep,
or other mammal. A "prostate cancer polypeptide" and a "prostate cancer polynucleotide"
include both naturally occurring and recombinant forms.

A "full length" prostate cancer protein or nucleic acid refers to a prostate
cancer polypeptide or polynucleotide sequence, or a variant thereof, that contains all of the
elements normally contained in one or more naturally occurring, wild type prostate cancer
polynucleotide or polypeptide sequences. For example, a full length prostate cancer nucleic
acid will typically comprise all of the exons that encode the full length, naturally occurring
protein. The "full length" may be prior to, or after, various stages of post-translation
processing or splicing, including alternative splicing.
"Biological sample" as used herein is a sample of biological tissue or fluid that
contains nucleic acids or polypeptides, e.g., of a prostate cancer protein, polynucleotide or
transcript. Such samples include, but are not limited to, tissue isolated from primates, e.g.,
humans, or rodents, e.g., mice and rats. Biological samples may also include sections of
tissues such as biopsy and autopsy samples, frozen sections taken for histologic purposes,
blood, plasma, serum, sputum, stool, tears, mucus, hair, skin, etc. Biological samples also
include explants and primary and/or transformed cell cultures derived from patient tissues. A
biological sample is typically obtained from a eukaryotic organism, most preferably a
mammal such as a primate, e.g., chimpanzee or human; cow; dog; cat; a rodent, e.g., guinea
pig, rat, mouse; rabbit; or a bird; reptile; or fish.

"Providing a biological sample" means to obtain a biological sample for use in
methods described in this invention. Most often, this will be done by removing a sample of
cells from an animal, but can also be accomplished by using previously isolated cells (e.g.,
isolated by another person, at another time, and/or for another purpose), or by performing the
methods of the invention in vivo. Archival tissues, having treatment or outcome history, will
be particularly useful.

The terms "identical" or percent "identity," in the context of two or more
nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that
are the same or have a specified percentage of amino acid residues or nucleotides that are the
same (i.e., about 60% identity, preferably 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%,
95%, 96%, 97%, 98%, 99%, or higher identity over a specified region, when compared and
aligned for maximum correspondence over a comparison window or designated region) as
measured using the BLAST or BLAST 2.0 sequence comparison algorithms with default
parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI
web site http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to
be "substantially identical." This definition also refers to, or may be applied to, the
complement of a test sequence. The definition also includes sequences that have deletions
and/or additions, as well as those that have substitutions, as well as naturally occurring, e.g.,
polymorphic or allelic variants, and man-made variants. As described below, the preferred
algorithms can account for gaps and the like. Preferably, identity exists over a region that is
at least about 25 amino acids or nucleotides in length, or more preferably over a region that is
50-100 amino acids or nucleotides in length.
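
By way of illustration only, percent identity over an already aligned region can be computed as sketched below. The function name, the use of "-" as the gap character, and the choice to count a residue aligned against a gap as a mismatch are assumptions of this sketch; alignment programs differ in how they treat gapped columns, so values from this routine need not match those reported by any particular program.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns in which both sequences carry the same residue.

    The inputs are two rows of a pairwise alignment and must have equal length.
    Columns where both rows hold a gap are ignored; a residue opposite a gap
    counts as a mismatch (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue  # uninformative column
        columns += 1
        if res_a == res_b:
            matches += 1
    if columns == 0:
        raise ValueError("alignment contains no comparable columns")
    return 100.0 * matches / columns
```

For two ten-nucleotide rows that differ at a single position, this routine returns 90.0, which would satisfy an "at least 80% identical" criterion over that region.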

For sequence comparison, typically one sequence acts as a reference sequence,
to which test sequences are compared. When using a sequence comparison algorithm, test
and reference sequences are entered into a computer, subsequence coordinates are designated,
if necessary, and sequence algorithm program parameters are designated. Preferably, default
program parameters can be used, or alternative parameters can be designated. The sequence
comparison algorithm then calculates the percent sequence identities for the test sequences
relative to the reference sequence, based on the program parameters.

A "comparison window", as used herein, includes reference to a segment of
any one of the number of contiguous positions selected from the group consisting typically of
from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150, in which
a sequence may be compared to a reference sequence of the same number of contiguous
positions after the two sequences are optimally aligned. Methods of alignment of sequences
for comparison are well-known in the art. Optimal alignment of sequences for comparison
can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl.
Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol.
Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l.
Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms
(GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package,
Genetics Computer Group, 575 Science Dr., Madison, WI), or by manual alignment and
visual inspection (see, e.g., Current Protocols in Molecular Biology (Ausubel et al., eds.
1995 supplement)).
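
By way of illustration only, the local homology approach of Smith & Waterman can be sketched as the following dynamic-programming routine, which returns the best local alignment score between two sequences. The function name and the scoring values (match, mismatch, and a linear gap penalty) are assumptions chosen for this sketch; they are not parameters specified herein, and the packaged programs listed above use their own scoring schemes.

```python
def smith_waterman_score(seq_a, seq_b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between seq_a and seq_b.

    Uses a linear gap penalty and keeps only two rows of the scoring
    matrix, since only the score (not the traceback) is returned.
    All scoring values are hypothetical defaults for illustration.
    """
    previous = [0] * (len(seq_b) + 1)
    best = 0
    for i in range(1, len(seq_a) + 1):
        current = [0] * (len(seq_b) + 1)
        for j in range(1, len(seq_b) + 1):
            substitution = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            current[j] = max(
                0,                              # start a new local alignment
                previous[j - 1] + substitution, # align the two residues
                previous[j] + gap,              # gap in seq_b
                current[j - 1] + gap,           # gap in seq_a
            )
            best = max(best, current[j])
        previous = current
    return best
```

The floor of zero in each cell is what makes the alignment local: a poorly matching prefix is discarded rather than carried forward, so a short shared segment embedded in otherwise unrelated sequence is still scored in full.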

Preferred examples of algorithms that are suitable for determining percent
sequence identity and sequence similarity include the BLAST and BLAST 2.0 algorithms,
which are described in Altschul et al., Nuc. Acids Res. 25:3389-3402 (1997) and Altschul et
al., J. Mol. Biol. 215:403-410 (1990). BLAST and BLAST 2.0 are used, with the parameters
described herein, to determine percent sequence identity for the nucleic acids and proteins of
the invention. Software for performing BLAST analyses is publicly available through the
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short
words of length W in the query sequence, which either match or satisfy some positive-valued
threshold score T when aligned with a word of the same length in a database sequence. T is
referred to as the neighborhood word score threshold (Altschul et al., supra). These initial
neighborhood word hits act as seeds for initiating searches to find longer HSPs containing
them. The word hits are extended in both directions along each sequence for as far as the
cumulative alignment score can be increased. Cumulative scores are calculated using, e.g.,
for nucleotide sequences, the parameters M (reward score for a pair of matching residues;
always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid
sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word
hits in each direction is halted when: the cumulative alignment score falls off by the
quantity X from its maximum achieved value; the cumulative score goes to zero or below,
due to the accumulation of one or more negative-scoring residue alignments; or the end of
either sequence is reached. The BLAST algorithm parameters W, T, and X determine the
sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences)
uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a
comparison of both strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix
(see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)), alignments (B) of
50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
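
By way of illustration only, the seeding and extension steps described above can be sketched as follows for nucleotide sequences. The sketch uses the reward M=5 and penalty N=-4 given above; the function names and the drop-off value X=20 are assumptions. It is deliberately simplified: only exact word matches are used as seeds (the neighborhood threshold T is not modeled), extension is ungapped, and only the X drop-off and end-of-sequence stopping conditions are implemented. It is not the BLAST program itself.

```python
def find_word_hits(query, subject, word_len):
    """Return (query_pos, subject_pos) pairs where an exact word of
    length word_len occurs in both sequences."""
    index = {}
    for s in range(len(subject) - word_len + 1):
        index.setdefault(subject[s:s + word_len], []).append(s)
    hits = []
    for q in range(len(query) - word_len + 1):
        for s in index.get(query[q:q + word_len], []):
            hits.append((q, s))
    return hits


def _extend(query, subject, qi, si, step, reward, penalty, x_drop):
    """Walk from (qi, si) in direction step (+1 or -1) without gaps.
    Return the best score gain and the number of positions kept."""
    gain = best_gain = 0
    length = kept = 0
    while 0 <= qi < len(query) and 0 <= si < len(subject):
        gain += reward if query[qi] == subject[si] else penalty
        length += 1
        if gain > best_gain:
            best_gain, kept = gain, length
        elif best_gain - gain > x_drop:
            break  # score fell more than X below its maximum
        qi += step
        si += step
    return best_gain, kept


def extend_hit(query, subject, q, s, word_len, reward=5, penalty=-4, x_drop=20):
    """Extend a word hit in both directions.
    Return (score, query_start, query_end) with query_end exclusive."""
    right_gain, right_len = _extend(
        query, subject, q + word_len, s + word_len, 1, reward, penalty, x_drop)
    left_gain, left_len = _extend(
        query, subject, q - 1, s - 1, -1, reward, penalty, x_drop)
    score = reward * word_len + left_gain + right_gain
    return score, q - left_len, q + word_len + right_len
```

With a word length of 3, the sequences "AAACGTTT" and "CCACGTGG" share the words ACG and CGT; extending the first hit to the right picks up the adjacent matching T and yields a four-residue segment scoring 20 under M=5.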

The BLAST algorithm also performs a statistical analysis of the similarity
between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA
90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest
sum probability (P(N)), which provides an indication of the probability by which a match
between two nucleotide or amino acid sequences would occur by chance. For example, a
nucleic acid is considered similar to a reference sequence if the smallest sum probability in a
comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more
preferably less than about 0.01, and most preferably less than about 0.001. Log values may
be large negative numbers, e.g., 5, 10, 20, 30, 40, 40, 70, 90, 110, 150, 170, etc.

An indication that two nucleic acid sequences or polypeptides are substantially
identical is that the polypeptide encoded by the first nucleic acid is immunologically
cross-reactive with the antibodies raised against the polypeptide encoded by the second nucleic
acid, as described below. Thus, a polypeptide is typically substantially identical to a second
polypeptide, e.g., where the two peptides differ only by conservative substitutions. Another
indication that two nucleic acid sequences are substantially identical is that the two molecules
or their complements hybridize to each other under stringent conditions, as described below.
Yet another indication that two nucleic acid sequences are substantially identical is that the
same primers can be used to amplify the sequences.

A "host cell" is a naturally occurring cell or a transformed cell that contains an
expression vector and supports the replication or expression of the expression vector. Host
cells may be cultured cells, explants, cells in vivo, and the like. Host cells may be
prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or
mammalian cells such as CHO, HeLa, and the like (see, e.g., the American Type Culture
Collection catalog or web site, www.atcc.org).

The terms "isolated," "purified," or "biologically pure" refer to material that is 
substantially or essentially free from components that normally accompany it as found in its 
native state. Purity and homogeneity are typically determined using analytical chemistry 
techniques such as polyacrylamide gel electrophoresis or high performance liquid 

chromatography. A protein or nucleic acid that is the predominant species present in a
preparation is substantially purified. In particular, an isolated nucleic acid is separated from
some open reading frames that naturally flank the gene and encode proteins other than the
protein encoded by the gene. The term "purified" in some embodiments denotes that a nucleic acid
or protein gives rise to essentially one band in an electrophoretic gel. Preferably, it means
that the nucleic acid or protein is at least 85% pure, more preferably at least 95% pure, and
most preferably at least 99% pure. "Purify" or "purification" in other embodiments means
removing at least one contaminant from the composition to be purified. In this sense,
purification does not require that the purified compound be homogenous, e.g., 100% pure.

The terms "polypeptide," "peptide" and "protein" are used interchangeably
herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers
in which one or more amino acid residues is an artificial chemical mimetic of a corresponding
naturally occurring amino acid, as well as to naturally occurring amino acid polymers, those
containing modified residues, and non-naturally occurring amino acid polymers.

The term "amino acid" refers to naturally occurring and synthetic amino acids,
as well as amino acid analogs and amino acid mimetics that function similarly to the naturally
occurring amino acids. Naturally occurring amino acids are those encoded by the genetic
code, as well as those amino acids that are later modified, e.g., hydroxyproline,
γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have
the same basic chemical structure as a naturally occurring amino acid, e.g., an α carbon that is
bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine,
norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs may have
modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic
chemical structure as a naturally occurring amino acid. Amino acid mimetics refers to
chemical compounds that have a structure that is different from the general chemical
structure of an amino acid, but that function similarly to a naturally occurring amino acid.

Amino acids may be referred to herein by either their commonly known three
letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical
Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly
accepted single-letter codes.

"Conservatively modified variants" applies to both amino acid and nucleic
acid sequences. With respect to particular nucleic acid sequences, conservatively modified
variants refers to those nucleic acids which encode identical or essentially identical amino
acid sequences, or where the nucleic acid does not encode an amino acid sequence, to
essentially identical or associated, e.g., naturally contiguous, sequences. Because of the
degeneracy of the genetic code, a large number of functionally identical nucleic acids encode
most proteins. For instance, the codons GCA, GCC, GCG and GCU all encode the amino
acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can
be altered to another of the corresponding codons described without altering the encoded
polypeptide. Such nucleic acid variations are "silent variations," which are one species of
conservatively modified variations. Every nucleic acid sequence herein which encodes a
polypeptide also describes silent variations of the nucleic acid. One of skill will recognize
that in certain contexts each codon in a nucleic acid (except AUG, which is ordinarily the
only codon for methionine, and TGG, which is ordinarily the only codon for tryptophan) can
be modified to yield a functionally identical molecule. Accordingly, silent variations of
a nucleic acid which encodes a polypeptide are often implicit in a described sequence with
respect to the expression product, but not with respect to actual probe sequences.
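
By way of non-limiting illustration, the notion of a silent variation can be sketched in
Python. The table below is deliberately partial and covers only the codons named in the
preceding paragraph; the names CODON_TABLE, synonymous_codons and is_silent are
hypothetical and chosen for this sketch only.

```python
# Partial codon table in the RNA alphabet; a complete table would list all 64 codons.
CODON_TABLE = {
    "GCA": "Ala", "GCC": "Ala", "GCG": "Ala", "GCU": "Ala",
    "AUG": "Met",
    "UGG": "Trp",
}

def _normalize(codon):
    """Upper-case a codon and express it in the RNA alphabet (T becomes U)."""
    return codon.upper().replace("T", "U")

def synonymous_codons(codon):
    """Return, sorted, every codon in the table that encodes the same amino acid."""
    amino_acid = CODON_TABLE[_normalize(codon)]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)

def is_silent(codon_a, codon_b):
    """True if the two codons differ but encode the same amino acid."""
    a, b = _normalize(codon_a), _normalize(codon_b)
    return a != b and CODON_TABLE[a] == CODON_TABLE[b]
```

In this sketch the alanine codons are interchangeable, whereas AUG and TGG each have no
synonymous alternative, consistent with the exception noted above.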

As to amino acid sequences, one of skill will recognize that individual
substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein
sequence which alter, add or delete a single amino acid or a small percentage of amino
acids in the encoded sequence are "conservatively modified variants" where the alteration
results in the substitution of an amino acid with a chemically similar amino acid.
Conservative substitution tables providing functionally similar amino acids are well known in
the art. Such conservatively modified variants are in addition to and do not exclude
polymorphic variants, interspecies homologs, and alleles of the invention. The following
groups each contain amino acids that are typically conservative substitutions for one another:
1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine (N),
Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M),
Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S), Threonine (T);
and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins (1984)).
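
By way of non-limiting illustration, the eight groups listed above can be encoded as sets
and used to test whether two peptides of equal length differ only by conservative
substitutions. The names CONSERVATIVE_GROUPS, is_conservative and
differ_only_conservatively are hypothetical, and the requirement of equal length is a
simplifying assumption of this sketch.

```python
# The eight groups recited above, in one-letter code.
CONSERVATIVE_GROUPS = [
    {"A", "G"},
    {"D", "E"},
    {"N", "Q"},
    {"R", "K"},
    {"I", "L", "M", "V"},
    {"F", "Y", "W"},
    {"S", "T"},
    {"C", "M"},
]

def is_conservative(residue_a, residue_b):
    """True if two different residues share at least one group."""
    if residue_a == residue_b:
        return False
    return any(residue_a in g and residue_b in g for g in CONSERVATIVE_GROUPS)

def differ_only_conservatively(peptide_a, peptide_b):
    """True if equal-length peptides differ, if at all, only by conservative substitutions."""
    if len(peptide_a) != len(peptide_b):
        return False
    return all(a == b or is_conservative(a, b) for a, b in zip(peptide_a, peptide_b))
```

Because methionine appears in both group 5 and group 8, the sketch checks membership in
any shared group rather than assigning each residue to a single group.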

Macromolecular structures such as polypeptide structures can be described in
terms of various levels of organization. For a general discussion of this organization, see,
e.g., Alberts et al., Molecular Biology of the Cell (3rd ed., 1994) and Cantor & Schimmel,
Biophysical Chemistry Part I: The Conformation of Biological Macromolecules (1980).
"Primary structure" refers to the amino acid sequence of a particular peptide. "Secondary
structure" refers to locally ordered, three dimensional structures within a polypeptide. These
structures are commonly known as domains. Domains are portions of a polypeptide that
often form a compact unit of the polypeptide and are typically 25 to approximately 500
amino acids long. Typical domains are made up of sections of lesser organization such as
stretches of β-sheet and α-helices. "Tertiary structure" refers to the complete three
dimensional structure of a polypeptide monomer. "Quaternary structure" refers to the three
dimensional structure formed, usually by the noncovalent association of independent tertiary
units. Anisotropic terms are also known as energy terms.

"Nucleic acid" or "oligonucleotide" or "polynucleotide" or grammatical
equivalents used herein means at least two nucleotides covalently linked together.
Oligonucleotides are typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more
nucleotides in length, up to about 100 nucleotides in length. Nucleic acids and
polynucleotides are polymers of any length, including longer lengths, e.g., 200, 300, 500,
1000, 2000, 3000, 5000, 7000, 10,000, etc. A nucleic acid of the present invention will
generally contain phosphodiester bonds, although in some cases, nucleic acid analogs are
included that may have alternate backbones, comprising, e.g., phosphoramidate,
phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages (see Eckstein,
Oligonucleotides and Analogues: A Practical Approach, Oxford University Press); and
peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with
positive backbones; non-ionic backbones, and non-ribose backbones, including those
described in U.S. Patent Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ASC
Symposium Series 580, Carbohydrate Modifications in Antisense Research, Sanghui &
Cook, eds. Nucleic acids containing one or more carbocyclic sugars are also included within
one definition of nucleic acids. Modifications of the ribose-phosphate backbone may be done
for a variety of reasons, e.g., to increase the stability and half-life of such molecules in
physiological environments or as probes on a biochip. Mixtures of naturally occurring
nucleic acids and analogs can be made; alternatively, mixtures of different nucleic acid
analogs, and mixtures of naturally occurring nucleic acids and analogs may be made.

A variety of references disclose such nucleic acid analogs, including, for
example, phosphoramidate (Beaucage et al., Tetrahedron 49(10):1925 (1993) and references
therein; Letsinger, J. Org. Chem. 35:3800 (1970); Sprinzl et al., Eur. J. Biochem. 81:579
(1977); Letsinger et al., Nucl. Acids Res. 14:3487 (1986); Sawai et al., Chem. Lett. 805
(1984); Letsinger et al., J. Am. Chem. Soc. 110:4470 (1988); and Pauwels et al., Chemica
Scripta 26:141 (1986)), phosphorothioate (Mag et al., Nucleic Acids Res. 19:1437 (1991);
and U.S. Patent No. 5,644,048), phosphorodithioate (Briu et al., J. Am. Chem. Soc. 111:2321
(1989)), O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues:
A Practical Approach, Oxford University Press), and peptide nucleic acid backbones and
linkages (see Egholm, J. Am. Chem. Soc. 114:1895 (1992); Meier et al., Chem. Int. Ed. Engl.
31:1008 (1992); Nielsen, Nature 365:566 (1993); Carlsson et al., Nature 380:207 (1996), all
of which are incorporated by reference). Other analog nucleic acids include those with
positive backbones (Denpcy et al., Proc. Natl. Acad. Sci. USA 92:6097 (1995)); non-ionic
backbones (U.S. Patent Nos. 5,386,023, 5,637,684, 5,602,240, 5,216,141 and 4,469,863;
Kiedrowshi et al., Angew. Chem. Intl. Ed. English 30:423 (1991); Letsinger et al., J. Am.
Chem. Soc. 110:4470 (1988); Letsinger et al., Nucleoside & Nucleotide 13:1597 (1994);
Chapters 2 and 3, ASC Symposium Series 580, "Carbohydrate Modifications in Antisense
Research", Ed. Y.S. Sanghui and P. Dan Cook; Mesmaeker et al., Bioorganic & Medicinal
Chem. Lett. 4:395 (1994); Jeffs et al., J. Biomolecular NMR 34:17 (1994); Tetrahedron Lett.
37:743 (1996)) and non-ribose backbones, including those described in U.S. Patent Nos.
5,235,033 and 5,034,506, and Chapters 6 and 7, ASC Symposium Series 580, "Carbohydrate
Modifications in Antisense Research", Ed. Y.S. Sanghui and P. Dan Cook. Nucleic acids
containing one or more carbocyclic sugars are also included within one definition of nucleic
acids (see Jenkins et al., Chem. Soc. Rev. (1995) pp 169-176). Several nucleic acid analogs
are described in Rawls, C&E News June 2, 1997 page 35. All of these references are hereby
expressly incorporated by reference.

Particularly preferred are peptide nucleic acids (PNA), which include peptide
nucleic acid analogs. These backbones are substantially non-ionic under neutral conditions, in
contrast to the highly charged phosphodiester backbone of naturally occurring nucleic acids.
This results in two advantages. First, the PNA backbone exhibits improved hybridization
kinetics. PNAs have larger changes in the melting temperature (Tm) for mismatched versus
perfectly matched basepairs. DNA and RNA typically exhibit a 2-4°C drop in Tm for an
internal mismatch. With the non-ionic PNA backbone, the drop is closer to 7-9°C. Similarly,
due to their non-ionic nature, hybridization of the bases attached to these backbones is
relatively insensitive to salt concentration. In addition, PNAs are not degraded by cellular
enzymes, and thus can be more stable.

The nucleic acids may be single stranded or double stranded, as specified, or
contain portions of both double stranded or single stranded sequence. As will be appreciated
by those in the art, the depiction of a single strand also defines the sequence of the
complementary strand; thus the sequences described herein also provide the complement of
the sequence. The nucleic acid may be DNA, both genomic and cDNA, RNA or a hybrid,
where the nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and
combinations of bases, including uracil, adenine, thymine, cytosine, guanine, inosine,
xanthine, hypoxanthine, isocytosine, isoguanine, etc. "Transcript" typically refers to a
naturally occurring RNA, e.g., a pre-mRNA, hnRNA, or mRNA. As used herein, the term
"nucleoside" includes nucleotides and nucleoside and nucleotide analogs, and modified
nucleosides such as amino modified nucleosides. In addition, "nucleoside" includes
non-naturally occurring analog structures. Thus, e.g., the individual units of a peptide nucleic
acid, each containing a base, are referred to herein as a nucleoside.
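
By way of non-limiting illustration, the statement that the depiction of a single strand
also defines the complementary strand can be sketched in Python for DNA written 5' to 3'.
The function name reverse_complement and the restriction to the four standard DNA bases
are assumptions of this sketch; modified bases such as inosine are not handled.

```python
# Translation table pairing each standard DNA base with its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(sequence):
    """Return the complementary strand, read 5' to 3', of a DNA sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which reflects that each
strand fully determines the other.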

A "label" or a "detectable moiety" is a composition detectable by
spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical
means. For example, useful labels include fluorescent dyes, electron-dense reagents, enzymes
(e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins or other
entities which can be made detectable, e.g., by incorporating a radiolabel into the peptide or
used to detect antibodies specifically reactive with the peptide. The radioisotope may be, for
example, 3H, 14C, 32P, 35S, or 125I. In some cases, particularly using antibodies against the
proteins of the invention, the radioisotopes are used as toxic moieties, as described below.
The labels may be incorporated into the prostate cancer nucleic acids, proteins and antibodies
at any position. Any method known in the art for conjugating the antibody to the label may
be employed, including those methods described by Hunter et al., Nature 144:945 (1962);
David et al., Biochemistry 13:1014 (1974); Pain et al., J. Immunol. Meth. 40:219 (1981);
and Nygren, J. Histochem. and Cytochem. 30:407 (1982). The lifetime of radiolabeled
peptides or radiolabeled antibody compositions may be extended by the addition of substances
that stabilize the radiolabeled peptide or antibody and protect it from degradation. Any
substance or combination of substances that stabilize the radiolabeled peptide or antibody may
be used, including those substances disclosed in US Patent No. 5,961,955.

An "effector" or "effector moiety" or "effector component" is a molecule that
is bound (or linked, or conjugated), either covalently, through a linker or a chemical bond, or
noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds, to an antibody.
The "effector" can be a variety of molecules including, e.g., detection moieties including
radioactive compounds, fluorescent compounds, an enzyme or substrate, tags such as epitope
tags, a toxin; activatable moieties, a chemotherapeutic agent; a lipase; an antibiotic; or a
radioisotope emitting "hard" e.g., beta radiation.

A "labeled nucleic acid probe or oligonucleotide" is one that is bound, either
covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der
Waals, electrostatic, or hydrogen bonds to a label such that the presence of the probe may be
detected by detecting the presence of the label bound to the probe. Alternatively, methods
using high affinity interactions may achieve the same results where one of a pair of binding
partners binds to the other, e.g., biotin, streptavidin.

As used herein a "nucleic acid probe or oligonucleotide" is defined as a
nucleic acid capable of binding to a target nucleic acid of complementary sequence through
one or more types of chemical bonds, usually through complementary base pairing, usually
through hydrogen bond formation. As used herein, a probe may include natural (i.e., A, G, C,
or T) or modified bases (7-deazaguanosine, inosine, etc.). In addition, the bases in a probe
may be joined by a linkage other than a phosphodiester bond, so long as it does not
functionally interfere with hybridization. Thus, e.g., probes may be peptide nucleic acids in
which the constituent bases are joined by peptide bonds rather than phosphodiester linkages.
It will be understood by one of skill in the art that probes may bind target sequences lacking
complete complementarity with the probe sequence depending upon the stringency of the
hybridization conditions. The probes are preferably directly labeled as with isotopes,
chromophores, lumiphores, chromogens, or indirectly labeled such as with biotin to which a
streptavidin complex may later bind. By assaying for the presence or absence of the probe,
one can detect the presence or absence of the select sequence or subsequence. Diagnosis or
prognosis may be based at the genomic level, or at the level of RNA or protein expression.

The term "recombinant" when used with reference, e.g., to a cell, or nucleic
acid, protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been
modified by the introduction of a heterologous nucleic acid or protein or the alteration of a
native nucleic acid or protein, or that the cell is derived from a cell so modified. Thus, e.g.,
recombinant cells express genes that are not found within the native (non-recombinant) form
of the cell or express native genes that are otherwise abnormally expressed, under expressed
or not expressed at all. By the term "recombinant nucleic acid" herein is meant nucleic acid,
originally formed in vitro, in general, by the manipulation of nucleic acid, e.g., using
polymerases and endonucleases, in a form not normally found in nature. In this manner,
operable linkage of different sequences is achieved. Thus an isolated nucleic acid, in a linear
form, or an expression vector formed in vitro by ligating DNA molecules that are not
normally joined, are both considered recombinant for the purposes of this invention. It is
understood that once a recombinant nucleic acid is made and reintroduced into a host cell or
organism, it will replicate non-recombinantly, i.e., using the in vivo cellular machinery of the
host cell rather than in vitro manipulations; however, such nucleic acids, once produced
recombinantly, although subsequently replicated non-recombinantly, are still considered
recombinant for the purposes of the invention. Similarly, a "recombinant protein" is a protein
made using recombinant techniques, i.e., through the expression of a recombinant nucleic
acid as depicted above.

The term "heterologous" when used with reference to portions of a nucleic
acid indicates that the nucleic acid comprises two or more subsequences that are not normally
found in the same relationship to each other in nature. For instance, the nucleic acid is
typically recombinantly produced, having two or more sequences, e.g., from unrelated genes
arranged to make a new functional nucleic acid, e.g., a promoter from one source and a
coding region from another source. Similarly, a heterologous protein will often refer to two
or more subsequences that are not found in the same relationship to each other in nature (e.g.,
a fusion protein).

A "promoter" is defined as an array of nucleic acid control sequences that
direct transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic
acid sequences near the start site of transcription, such as, in the case of a polymerase II type
promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor
elements, which can be located as much as several thousand base pairs from the start site of
transcription. A "constitutive" promoter is a promoter that is active under most
environmental and developmental conditions. An "inducible" promoter is a promoter that is
active under environmental or developmental regulation. The term "operably linked" refers
to a functional linkage between a nucleic acid expression control sequence (such as a
promoter, or array of transcription factor binding sites) and a second nucleic acid sequence,
wherein the expression control sequence directs transcription of the nucleic acid
corresponding to the second sequence.

An "expression vector" is a nucleic acid construct, generated recombinantly or
synthetically, with a series of specified nucleic acid elements that permit transcription of a
particular nucleic acid in a host cell. The expression vector can be part of a plasmid, virus, or
nucleic acid fragment. Typically, the expression vector includes a nucleic acid to be
transcribed operably linked to a promoter.

The phrase "selectively (or specifically) hybridizes to" refers to the binding,
duplexing, or hybridizing of a molecule only to a particular nucleotide sequence that is
determinative of the presence of the nucleotide sequence, in a heterogeneous population of
nucleic acids and other biologics (e.g., total cellular or library DNA or RNA). Similarly, the
phrase "specifically (or selectively) binds" to an antibody or "specifically (or selectively)
immunoreactive with," when referring to a protein or peptide, refers to a binding reaction that
is determinative of the presence of the protein, in a heterogeneous population of proteins and
other biologics. Thus, under designated immunoassay or nucleic acid hybridization
conditions, the specified antibodies or nucleic acid probes bind to a particular protein or
nucleotide sequence at least two times the background and more typically more than 10 to
100 times background.

Specific binding to an antibody under such conditions requires an antibody
that is selected for its specificity for a particular protein. For example, polyclonal antibodies
raised to a particular protein, polymorphic variants, alleles, orthologs, and conservatively
modified variants, or splice variants, or portions thereof, can be selected to obtain only those
polyclonal antibodies that are specifically immunoreactive with the desired prostate cancer
protein and not with other proteins. This selection may be achieved by subtracting out
antibodies that cross-react with other molecules. A variety of immunoassay formats may be
used to select antibodies specifically immunoreactive with a particular protein. For example,
solid-phase ELISA immunoassays are routinely used to select antibodies specifically
immunoreactive with a protein (see, e.g., Harlow & Lane, Antibodies, A Laboratory Manual
(1988) for a description of immunoassay formats and conditions that can be used to
determine specific immunoreactivity).

The phrase "stringent hybridization conditions" refers to conditions under
which a probe will hybridize to its target subsequence, typically in a complex mixture of
nucleic acids, but to no other sequences. Stringent conditions are sequence-dependent and
will be different in different circumstances. Longer sequences hybridize specifically at
higher temperatures. An extensive guide to the hybridization of nucleic acids is found in
Tijssen, Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic
Probes, "Overview of principles of hybridization and the strategy of nucleic acid assays"
(1993). Generally, stringent conditions are selected to be about 5-10°C lower than the
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The
Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at
which 50% of the probes complementary to the target hybridize to the target sequence at
equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are
occupied at equilibrium). Stringent conditions will be those in which the salt concentration is
less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or
other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g.,
10 to 50 nucleotides) and at least about 60°C for long probes (e.g., greater than 50 nucleotides).
Stringent conditions may also be achieved with the addition of destabilizing agents such as
formamide. For selective or specific hybridization, a positive signal is at least two times
background, preferably 10 times background hybridization. Exemplary stringent
hybridization conditions can be as follows: 50% formamide, 5x SSC, and 1% SDS,
incubating at 42°C, or, 5x SSC, 1% SDS, incubating at 65°C, with wash in 0.2x SSC, and
0.1% SDS at 65°C. For PCR, a temperature of about 36°C is typical for low stringency
amplification, although annealing temperatures may vary between about 32°C and 48°C
depending on primer length. For high stringency PCR amplification, a temperature of about
62°C is typical, although high stringency annealing temperatures can range from about 50°C
to about 65°C, depending on the primer length and specificity. Typical cycle conditions for
both high and low stringency amplifications include a denaturation phase of 90°C - 95°C for
30 sec - 2 min., an annealing phase lasting 30 sec. - 2 min., and an extension phase of about
72°C for 1 - 2 min. Protocols and guidelines for low and high stringency amplification
reactions are provided, e.g., in Innis et al. (1990) PCR Protocols, A Guide to Methods and
Applications, Academic Press, Inc., N.Y.
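
By way of non-limiting illustration, the numerical guidelines above can be collected into
a few helper functions in Python. All function names are hypothetical, the Tm value is
assumed to be supplied by the user rather than computed, and treating every probe of 50
nucleotides or fewer as a "short" probe is a simplifying assumption of this sketch.

```python
def stringent_temperature_range(tm_celsius):
    """Return the (low, high) window lying 10 to 5 degrees C below the supplied Tm."""
    return (tm_celsius - 10.0, tm_celsius - 5.0)

def minimum_stringent_temperature(probe_length):
    """Return the approximate minimum temperature in degrees C for a probe length.

    Probes longer than 50 nucleotides are treated as long probes (about 60),
    all others as short probes (about 30).
    """
    return 60.0 if probe_length > 50 else 30.0

def is_positive_signal(signal, background, fold=2.0):
    """True if the signal is at least `fold` times background (2x by default)."""
    return signal >= fold * background
```

For example, a probe with a Tm of 70°C would be hybridized at about 60-65°C under this
sketch, and passing fold=10.0 applies the preferred ten-fold criterion.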

Nucleic acids that do not hybridize to each other under stringent conditions are
still substantially identical if the polypeptides which they encode are substantially identical.
This occurs, e.g., when a copy of a nucleic acid is created using the maximum codon
degeneracy permitted by the genetic code. In such cases, the nucleic acids typically hybridize
under moderately stringent hybridization conditions. Exemplary "moderately stringent
hybridization conditions" include a hybridization in a buffer of 40% formamide, 1 M NaCl,
1% SDS at 37°C, and a wash in 1X SSC at 45°C. A positive hybridization is at least twice
background. Those of ordinary skill will readily recognize that alternative hybridization and
wash conditions can be utilized to provide conditions of similar stringency. Additional
guidelines for determining hybridization parameters are provided in numerous references, e.g.,
Current Protocols in Molecular Biology, ed. Ausubel, et al.

The phrase "functional effects" in the context of assays for testing compounds
that modulate activity of a prostate cancer protein includes the determination of a parameter
that is indirectly or directly under the influence of the prostate cancer protein or nucleic acid,
e.g., a functional, physical, or chemical effect, such as the ability to decrease prostate cancer.
It includes ligand binding activity; cell growth on soft agar; anchorage dependence; contact
inhibition and density limitation of growth; cellular proliferation; cellular transformation;
growth factor or serum dependence; tumor specific marker levels; invasiveness into Matrigel;
tumor growth and metastasis in vivo; mRNA and protein expression in cells undergoing
metastasis, and other characteristics of prostate cancer cells. "Functional effects" include in
vitro, in vivo, and ex vivo activities.

By "determining the functional effect" is meant assaying for a compound that 
increases or decreases a parameter that is indirectly or directly under the influence of a 

20 prostate cancer protein sequence, e.g., functional, enzymatic, physical and chemical effects. 
Such functional effects can be measured by any means known to those skilled in the art, e.g., 
changes in spectroscopic characteristics (e.g., fluorescence, absorbance, refractive index), 
hydrodynamic (e.g., shape), chromatographic, or solubility properties for the protein, 
measuring inducible markers or transcriptional activation of the prostate cancer protein; 

25 measuring binding activity or binding assays, e.g. binding to antibodies or other ligands, and 
measuring cellular proliferation. Determination of the functional effect of a compound on 
prostate cancer can also be performed using prostate cancer assays known to those of skill in 
the art such as an in vitro assays, e.g., cell growth on soft agar; anchorage dependence; 
contact inhibition and density limitation of growth; cellular proliferation; cellular 

30 transformation; growth factor or serum dependence; tumor specific marker levels; 
invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and protein 
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expression in cells undergoing metastasis, and other characteristics of prostate cancer cells. 
The functional effects can be evaluated by many means known to those skilled in the art, e.g., 
microscopy for quantitative or qualitative measures of alterations in morphological features, 
measurement of changes in RNA or protein levels for prostate cancer-associated sequences, 

5 measurement of RNA stability, identification of downstream or reporter gene expression 
(CAT, luciferase, p-gal, GFP and the like), e.g, via chemiluminescence, fluorescence, 
colorimetric reactions, antibody binding, inducible markers, and ligand binding assays. 

"Inhibitors", "activators", and "modulators" of prostate cancer polynucleotide 
and polypeptide sequences are used to refer to activating, inhibitory, or modulating molecules 

10 or compounds identified using in vitro and in vivo assays of prostate cancer polynucleotide 
and polypeptide sequences. Inhibitors are compounds that, e.g., bind to, partially or totally 
block activity, decrease, prevent, delay activation, inactivate, desensitize, or down regulate 
the activity or expression of prostate cancer proteins, e.g., antagonists. Antisense nucleic 
acids may seem to inhibit expression and subsequent function of the protein. "Activators" 

15 are compounds that increase, open, activate, facilitate, enhance activation, sensitize, agonize, 
or up regulate prostate cancer protein activity. Inhibitors, activators, or modulators also 
include genetically modified versions of prostate cancer proteins, e.g., versions with altered 
activity, as well as naturally occurring and synthetic ligands, antagonists, agonists, antibodies, 
small chemical molecules and the like. Such assays for inhibitors and activators include, e.g.,
expressing the prostate cancer protein in vitro, in cells, or cell membranes, applying putative
modulator compounds, and then determining the functional effects on activity, as described
above. Activators and inhibitors of prostate cancer can also be identified by incubating 
prostate cancer cells with the test compound and determining increases or decreases in the 
expression of 1 or more prostate cancer proteins, e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50

or more prostate cancer proteins, such as prostate cancer proteins encoded by the sequences
set out in Tables 1-16. 

Samples or assays comprising prostate cancer proteins that are treated with a 
potential activator, inhibitor, or modulator are compared to control samples without the 
inhibitor, activator, or modulator to examine the extent of inhibition. Control samples 

(untreated with inhibitors) are assigned a relative protein activity value of 100%. Inhibition
of a polypeptide is achieved when the activity value relative to the control is about 80%,
preferably 50%, more preferably 25-0%. Activation of a prostate cancer polypeptide is
achieved when the activity value relative to the control (untreated with activators) is 110%,
more preferably 150%, more preferably 200-500% (i.e., two to five fold higher relative to the
control), more preferably 1000-3000% higher.
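
By way of illustration only, the comparison to an untreated control described above can be sketched in Python. The function name classify_activity and the reading of "about 80%" as "80% or less" are assumptions made for this sketch; the specification itself gives only the percentage values.

```python
def classify_activity(treated, control, inhibit_at=80.0, activate_at=110.0):
    """Express treated activity as a percentage of the untreated control.

    The control is assigned 100%. The cutoffs default to the least
    stringent values given in the text (80% and 110%); treating them as
    "at or below" and "at or above" is an assumption of this sketch.
    """
    if control <= 0:
        raise ValueError("control activity must be positive")
    percent = 100.0 * treated / control
    if percent <= inhibit_at:
        label = "inhibition"
    elif percent >= activate_at:
        label = "activation"
    else:
        label = "no call"
    return percent, label


# Hypothetical raw activity readings (arbitrary units).
inhibited = classify_activity(40, 100)
activated = classify_activity(150, 100)
unchanged = classify_activity(95, 100)
```

Passing inhibit_at=50.0 or activate_at=150.0 applies the more preferred cutoffs instead.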
The phrase "changes in cell growth" refers to any change in cell growth and

proliferation characteristics in vitro or in vivo, such as formation of foci, anchorage 
independence, semi-solid or soft agar growth, changes in contact inhibition and density 
limitation of growth, loss of growth factor or serum requirements, changes in cell 
morphology, gaining or losing immortalization, gaining or losing tumor specific markers, 

ability to form or suppress tumors when injected into suitable animal hosts, and/or

immortalization of the cell. See, e.g., Freshney, Culture of Animal Cells a Manual of Basic 
Technique pp. 231-241 (3rd ed. 1994).

"Tumor cell" refers to precancerous, cancerous, and normal cells in a tumor.
"Cancer cells," "transformed" cells or "transformation" in tissue culture, refers 

to spontaneous or induced phenotypic changes that do not necessarily involve the uptake of
new genetic material. Although transformation can arise from infection with a transforming 
virus and incorporation of new genomic DNA, or uptake of exogenous DNA, it can also arise 
spontaneously or following exposure to a carcinogen, thereby mutating an endogenous gene. 
Transformation is associated with phenotypic changes, such as immortalization of cells, 

aberrant growth control, nonmorphological changes, and/or malignancy (see Freshney,
Culture of Animal Cells a Manual of Basic Technique (3rd ed. 1994)).

"Antibody" refers to a polypeptide comprising a framework region from an 
immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen. 
The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, 

epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable region
genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as 
gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG,
IgM, IgA, IgD and IgE, respectively. Typically, the antigen-binding region of an antibody or 
its functional equivalent will be most critical in specificity and affinity of binding. See Paul, 

Fundamental Immunology.

An exemplary immunoglobulin (antibody) structural unit comprises a
tetramer. Each tetramer is composed of two identical pairs of polypeptide chains, each pair
having one "light" (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus
of each chain defines a variable region of about 100 to 110 or more amino acids primarily
responsible for antigen recognition. The terms variable light chain (VL) and variable heavy
chain (VH) refer to these light and heavy chains respectively.

Antibodies exist, e.g., as intact immunoglobulins or as a number of well-
characterized fragments produced by digestion with various peptidases. Thus, e.g., pepsin
digests an antibody below the disulfide linkages in the hinge region to produce F(ab)'2, a
dimer of Fab which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab)'2
may be reduced under mild conditions to break the disulfide linkage in the hinge region,
thereby converting the F(ab)'2 dimer into an Fab' monomer. The Fab' monomer is
essentially Fab with part of the hinge region (see Fundamental Immunology (Paul ed., 3d ed.
1993)). While various antibody fragments are defined in terms of the digestion of an intact
antibody, one of skill will appreciate that such fragments may be synthesized de novo either
chemically or by using recombinant DNA methodology. Thus, the term antibody, as used
herein, also includes antibody fragments either produced by the modification of whole
antibodies, or those synthesized de novo using recombinant DNA methodologies (e.g., single
chain Fv) or those identified using phage display libraries (see, e.g., McCafferty et al., Nature
348:552-554 (1990)).

For preparation of antibodies, e.g., recombinant, monoclonal, or polyclonal
antibodies, many techniques known in the art can be used (see, e.g., Kohler & Milstein,
Nature 256:495-497 (1975); Kozbor et al., Immunology Today 4:72 (1983); Cole et al., pp.
77-96 in Monoclonal Antibodies and Cancer Therapy (1985); Coligan, Current Protocols in
Immunology (1991); Harlow & Lane, Antibodies, A Laboratory Manual (1988); and Goding,
Monoclonal Antibodies: Principles and Practice (2d ed. 1986)). Techniques for the
production of single chain antibodies (U.S. Patent 4,946,778) can be adapted to produce
antibodies to polypeptides of this invention. Also, transgenic mice, or other organisms such
as other mammals, may be used to express humanized antibodies. Alternatively, phage
display technology can be used to identify antibodies and heteromeric Fab fragments that
specifically bind to selected antigens (see, e.g., McCafferty et al., Nature 348:552-554
(1990); Marks et al., Biotechnology 10:779-783 (1992)).

A "chimeric antibody" is an antibody molecule in which (a) the constant 
region, or a portion thereof, is altered, replaced or exchanged so that the antigen binding site 
(variable region) is linked to a constant region of a different or altered class, effector function
and/or species, or an entirely different molecule which confers new properties to the chimeric 
antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the variable 
region, or a portion thereof, is altered, replaced or exchanged with a variable region having a 
different or altered antigen specificity. 

Identification of prostate cancer-associated sequences 

In one aspect, the expression levels of genes are determined in different 
patient samples for which diagnosis information is desired, to provide expression profiles. 
An expression profile of a particular sample is essentially a "fingerprint" of the state of the 

sample; while two states may have any particular gene similarly expressed, the evaluation of
a number of genes simultaneously allows the generation of a gene expression profile that is 
characteristic of the state of the cell. That is, normal tissue (e.g., normal prostate or other 
tissue) may be distinguished from cancerous or metastatic cancerous tissue of the prostate, or 
prostate cancer tissue or metastatic prostate cancerous tissue can be compared with tissue 

samples of prostate and other tissues from surviving cancer patients. By comparing

expression profiles of tissue in known different prostate cancer states, information regarding 
which genes are important (including both up- and down-regulation of genes) in each of these 
states is obtained. 
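
As a minimal sketch of comparing a sample's expression profile against profiles from known states, the following Python example assigns a sample to the reference state with which it correlates best. The use of Pearson correlation, the function names, and the four-gene profiles are all assumptions chosen for illustration; the specification does not prescribe a particular comparison statistic.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("a profile with no variation cannot be correlated")
    return cov / (sx * sy)


def closest_state(sample, references):
    """Return the reference state whose profile correlates best with the sample."""
    return max(references, key=lambda state: pearson(sample, references[state]))


# Hypothetical expression levels for the same four genes in each profile.
references = {
    "normal": [1.0, 2.0, 3.0, 4.0],
    "cancer": [4.0, 3.0, 2.0, 1.0],
}
sample = [1.1, 2.0, 2.9, 4.2]
best = closest_state(sample, references)
```

In practice a profile would cover many more genes, and the choice of statistic would depend on how the array data are normalized.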

The identification of sequences that are differentially expressed in prostate
cancer versus non-prostate cancer tissue allows the use of this information in a number of
ways. For example, a particular treatment regime may be evaluated: does a chemotherapeutic
drug act to down-regulate prostate cancer, and thus tumor growth or recurrence, in a
particular patient? Similarly, diagnosis and treatment outcomes may be determined or confirmed by
comparing patient samples with the known expression profiles. Metastatic tissue can also be
analyzed to determine the stage of prostate cancer in the tissue. Furthermore, these gene
expression profiles (or individual genes) allow screening of drug candidates with an eye to
mimicking or altering a particular expression profile; e.g., screening can be done for drugs
that suppress the prostate cancer expression profile. This may be done by making biochips
comprising sets of the important prostate cancer genes, which can then be used in these
screens. These methods can also be done on the protein basis; that is, protein expression
levels of the prostate cancer proteins can be evaluated for diagnostic purposes or to screen
candidate agents. In addition, the prostate cancer nucleic acid sequences can be administered
for gene therapy purposes, including the administration of antisense nucleic acids, or the
prostate cancer proteins (including antibodies and other modulators thereof) administered as
therapeutic drugs.

Thus the present invention provides nucleic acid and protein sequences that
are differentially expressed in prostate cancer, herein termed "prostate cancer sequences." As
outlined below, prostate cancer sequences include those that are up-regulated (i.e., expressed
at a higher level) in prostate cancer, as well as those that are down-regulated (i.e., expressed
at a lower level). In a preferred embodiment, the prostate cancer sequences are from humans;
however, as will be appreciated by those in the art, prostate cancer sequences from other
organisms may be useful in animal models of disease and drug evaluation; thus, other
prostate cancer sequences are provided, from vertebrates, including mammals, including
rodents (rats, mice, hamsters, guinea pigs, etc.), primates, farm animals (including sheep,
goats, pigs, cows, horses, etc.) and pets (e.g., dogs, cats, etc.). Prostate cancer sequences
from other organisms may be obtained using the techniques outlined below.

Prostate cancer sequences can include both nucleic acid and amino acid 
sequences. As will be appreciated by those in the art and is more fully outlined below, 
prostate cancer nucleic acid sequences are useful in a variety of applications, including 
diagnostic applications, which will detect naturally occurring nucleic acids, as well as 

screening applications; e.g., biochips comprising nucleic acid probes or PCR microtiter plates
with selected probes to the prostate cancer sequences can be generated. 

A prostate cancer sequence can be initially identified by substantial nucleic 
acid and/or amino acid sequence homology to the prostate cancer sequences outlined herein. 
Such homology can be based upon the overall nucleic acid or amino acid sequence, and is
generally determined as outlined below, using either homology programs or hybridization
conditions.

For identifying prostate cancer-associated sequences, the prostate cancer
screen typically includes comparing genes identified in different tissues, e.g., normal and
cancerous tissues, or tumor tissue samples from patients who have metastatic disease vs.
non-metastatic tissue. Other suitable tissue comparisons include comparing prostate cancer
samples with metastatic cancer samples from other cancers, such as lung, breast,
gastrointestinal cancers, ovarian, etc. Samples of different stages of prostate cancer, e.g.,
survivor tissue, drug resistant states, and tissue undergoing metastasis, are applied to biochips
comprising nucleic acid probes. The samples are first microdissected, if applicable, and
treated as is known in the art for the preparation of mRNA. Suitable biochips are
commercially available, e.g., from Affymetrix. Gene expression profiles as described herein
are generated and the data analyzed.

In one embodiment, the genes showing changes in expression as between 
normal and disease states are compared to genes expressed in other normal tissues, preferably 
normal prostate, but also including, and not limited to, lung, heart, brain, liver, breast, kidney,
muscle, colon, small intestine, large intestine, spleen, bone and placenta. In a preferred
embodiment, those genes identified during the prostate cancer screen that are expressed in 
any significant amount in other tissues are removed from the profile, although in some 
embodiments, this is not necessary. That is, when screening for drugs, it is usually preferable 
that the target be disease specific, to minimize possible side effects. 
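
The removal of genes that are expressed in other normal tissues can be sketched as a simple filter. This is an illustrative Python example only: the gene labels, tissue values, the numeric cutoff standing in for "any significant amount", and the function name filter_disease_specific are hypothetical.

```python
def filter_disease_specific(candidates, normal_expression, threshold):
    """Keep candidate genes whose expression stays below `threshold`
    in every normal tissue surveyed.

    `normal_expression` maps gene -> {tissue: expression level}.
    Genes with no normal-tissue measurements are kept, which is a
    choice made for this sketch rather than a stated requirement.
    """
    kept = []
    for gene in candidates:
        levels = normal_expression.get(gene, {})
        if all(level < threshold for level in levels.values()):
            kept.append(gene)
    return kept


# Hypothetical candidates from a screen, with hypothetical normal-tissue levels.
candidates = ["GENE_A", "GENE_B", "GENE_C"]
normal_expression = {
    "GENE_A": {"lung": 5.0, "heart": 8.0, "liver": 3.0},
    "GENE_B": {"lung": 250.0, "heart": 12.0, "liver": 9.0},
    "GENE_C": {"lung": 20.0, "heart": 15.0, "liver": 18.0},
}
specific = filter_disease_specific(candidates, normal_expression, threshold=50.0)
```

Here GENE_B is dropped because of its level in lung, which mirrors the preference that a drug target be disease specific.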

In a preferred embodiment, prostate cancer sequences are those that are up-
regulated in prostate cancer; that is, the expression of these genes is higher in the prostate
cancer tissue as compared to non-cancerous tissue. "Up-regulation" as used herein often
means at least about a two-fold change, preferably at least about a three-fold change, with at
least about five-fold or higher being preferred. All unigene cluster identification numbers
and accession numbers herein are for the GenBank sequence database and the sequences of
the accession numbers are hereby expressly incorporated by reference. GenBank is known in
the art, see, e.g., Benson, DA, et al., Nucleic Acids Research 26:1-7 (1998) and
http://www.ncbi.nlm.nih.gov/. Sequences are also available in other databases, e.g.,
European Molecular Biology Laboratory (EMBL) and DNA Database of Japan (DDBJ).

In another preferred embodiment, prostate cancer sequences are those that are
down-regulated in prostate cancer; that is, the expression of these genes is lower in prostate
cancer tissue as compared to non-cancerous tissue (see, e.g., Tables 8, 12 and 14). "Down-
regulation" as used herein often means at least about a 1.5-fold change, more preferably a
two-fold change, preferably at least about a three-fold change, with at least about five-fold or
higher being most preferred.
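
The fold-change criteria for up-regulation and down-regulation can be illustrated with a short Python sketch. The function names, the floor applied to very low signals, and the example values are assumptions; only the two-fold and 1.5-fold minimums come from the text.

```python
def fold_change(cancer, normal, floor=1.0):
    """Ratio of cancer to normal expression.

    Values below `floor` are raised to it so that a near-zero signal does
    not produce an unbounded ratio; the floor is an assumption of this sketch.
    """
    return max(cancer, floor) / max(normal, floor)


def call_regulation(cancer, normal, up_min=2.0, down_min=1.5):
    """Label a gene using the minimum fold changes given in the text:
    at least about two-fold for up-regulation and at least about
    1.5-fold for down-regulation."""
    ratio = fold_change(cancer, normal)
    if ratio >= up_min:
        return "up"
    if ratio <= 1.0 / down_min:
        return "down"
    return "unchanged"


# Hypothetical expression levels (arbitrary units).
up_call = call_regulation(600.0, 100.0)
down_call = call_regulation(100.0, 400.0)
flat_call = call_regulation(120.0, 100.0)
```

Raising up_min or down_min to 3.0 or 5.0 applies the more preferred thresholds.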

Informatics 

The ability to identify genes that are over or under expressed in prostate 
cancer can additionally provide high-resolution, high-sensitivity datasets which can be used 

in the areas of diagnostics, therapeutics, drug development, pharmacogenetics, protein
structure, biosensor development, and other related areas. For example, the expression 
profiles can be used in diagnostic or prognostic evaluation of patients with prostate cancer. 
Or as another example, subcellular toxicological information can be generated to better direct 
drug structure and activity correlation (see Anderson, Pharmaceutical Proteomics: Targets, 

Mechanism, and Function, paper presented at the IBC Proteomics conference, Coronado, CA
(June 11-12, 1998)). Subcellular toxicological information can also be utilized in a biological 
sensor device to predict the likely toxicological effect of chemical exposures and likely 
tolerable exposure thresholds (see U.S. Patent No. 5,811,231). Similar advantages accrue 
from datasets relevant to other biomolecules and bioactive agents (e.g., nucleic acids, 

saccharides, lipids, drugs, and the like).

Thus, in another embodiment, the present invention provides a database that 
includes at least one set of assay data. The data contained in the database is acquired, e.g., 
using array analysis either singly or in a library format. The database can be in substantially
any form in which data can be maintained and transmitted, but is preferably an electronic
database. The electronic database of the invention can be maintained on any electronic
device allowing for the storage of and access to the database, such as a personal computer, 
but is preferably distributed on a wide area network, such as the World Wide Web. 

The focus of the present section on databases that include peptide sequence 
data is for clarity of illustration only. It will be apparent to those of skill in the art that similar 

databases can be assembled for any assay data acquired using an assay of the invention.

The compositions and methods for identifying and/or quantitating the relative
and/or absolute abundance of a variety of molecular and macromolecular species from a
biological sample undergoing prostate cancer, i.e., the identification of prostate cancer-
associated sequences described herein, provide an abundance of information, which can be
correlated with pathological conditions, predisposition to disease, drug testing, therapeutic
monitoring, gene-disease causal linkages, identification of correlates of immunity and 
physiological status, among others. Although the data generated from the assays of the 
invention is suited for manual review and analysis, in a preferred embodiment, prior data 
processing using high-speed computers is utilized. 

An array of methods for indexing and retrieving biomolecular information is

known in the art. For example, U.S. Patents 6,023,659 and 5,966,712 disclose a relational 
database system for storing biomolecular sequence information in a manner that allows 
sequences to be catalogued and searched according to one or more protein function 
hierarchies. U.S. Patent 5,953,727 discloses a relational database having sequence records 

containing information in a format that allows a collection of partial-length DNA sequences
to be catalogued and searched according to association with one or more sequencing projects 
for obtaining full-length sequences from the collection of partial length sequences. U.S. 
Patent 5,706,498 discloses a gene database retrieval system for making a retrieval of a gene 
sequence similar to a sequence data item in a gene database based on the degree of similarity 

between a key sequence and a target sequence. U.S. Patent 5,538,897 discloses a method
using mass spectroscopy fragmentation patterns of peptides to identify amino acid sequences 
in computer databases by comparison of predicted mass spectra with experimentally-derived 
mass spectra using a closeness-of-fit measure. U.S. Patent 5,926,818 discloses a multi- 
dimensional database comprising a functionality for multi-dimensional data analysis 

described as on-line analytical processing (OLAP), which entails the consolidation of

projected and actual data according to more than one consolidation path or dimension. U.S. 
Patent 5,295,261 reports a hybrid database structure in which the fields of each database 
record are divided into two classes, navigational and informational data, with navigational 
fields stored in a hierarchical topological map which can be viewed as a tree structure or as 

the merger of two or more such tree structures.

See also Mount et al., Bioinformatics (2001); Biological Sequence Analysis:
Probabilistic Models of Proteins and Nucleic Acids (Durbin et al., eds., 1999);
Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins (Baxevanis &
Ouellette eds., 1998); Rashidi & Buehler, Bioinformatics: Basic Applications in Biological
Science and Medicine (1999); Introduction to Computational Molecular Biology (Setubal et
al., eds., 1997); Bioinformatics: Methods and Protocols (Misener & Krawetz, eds., 2000);
Bioinformatics: Sequence, Structure, and Databanks: A Practical Approach (Higgins &
Taylor, eds., 2000); Brown, Bioinformatics: A Biologist's Guide to Biocomputing and the
Internet (2001); Han & Kamber, Data Mining: Concepts and Techniques (2000); and
Waterman, Introduction to Computational Biology: Maps, Sequences, and Genomes (1995).

The present invention provides a computer database comprising a computer 
and software for storing in computer-retrievable form assay data records cross-tabulated, e.g., 
with data specifying the source of the target-containing sample from which each sequence 
specificity record was obtained. 

In an exemplary embodiment, at least one of the sources of target-containing

sample is from a control tissue sample known to be free of pathological disorders. In a 
variation, at least one of the sources is a known pathological tissue specimen, e.g., a 
neoplastic lesion or another tissue specimen to be analyzed for prostate cancer. In another 
variation, the assay records cross-tabulate one or more of the following parameters for each 

target species in a sample: (1) a unique identification code, which can include, e.g., a target
molecular structure and/or characteristic separation coordinate (e.g., electrophoretic 
coordinates); (2) sample source; and (3) absolute and/or relative quantity of the target species 
present in the sample. 
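
One way to picture the three cross-tabulated parameters is as a small record type, sketched below in Python. The class name AssayRecord, its field names, and the sample values are hypothetical; the specification defines only the kinds of information recorded, not a storage layout.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayRecord:
    """One assay record holding the three parameters listed in the text."""
    target_id: str      # (1) unique identification code for the target species
    sample_source: str  # (2) source of the target-containing sample
    quantity: float     # (3) absolute or relative quantity in the sample


def cross_tabulate(records):
    """Build a target-by-source table: table[target_id][sample_source] = quantity."""
    table = defaultdict(dict)
    for record in records:
        table[record.target_id][record.sample_source] = record.quantity
    return dict(table)


# Hypothetical records for two targets measured in two sample sources.
records = [
    AssayRecord("T1", "control tissue", 1.0),
    AssayRecord("T1", "neoplastic lesion", 4.5),
    AssayRecord("T2", "control tissue", 2.0),
    AssayRecord("T2", "neoplastic lesion", 0.5),
]
table = cross_tabulate(records)
```

Looking up table["T1"] then returns that target's quantity in each sample source side by side.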

The invention also provides for the storage and retrieval of a collection of
target data in a computer data storage apparatus, which can include magnetic disks, optical
disks, magneto-optical disks, DRAM, SRAM, SGRAM, SDRAM, RDRAM, DDR RAM,
magnetic bubble memory devices, and other data storage devices, including CPU registers
and on-CPU data storage arrays. Typically, the target data records are stored as a bit pattern
in an array of magnetic domains on a magnetizable medium or as an array of charge states or
transistor gate states, such as an array of cells in a DRAM device (e.g., each cell comprised of
a transistor and a charge storage area, which may be on the transistor). In one embodiment,
the invention provides such storage devices, and computer systems built therewith,
comprising a bit pattern encoding a protein expression fingerprint record comprising unique
identifiers for at least 10 target data records cross-tabulated with target source.

When the target is a peptide or nucleic acid, the invention preferably provides 

a method for identifying related peptide or nucleic acid sequences, comprising performing a
computerized comparison between a peptide or nucleic acid sequence assay record stored in
or retrieved from a computer storage device or database and at least one other sequence. The
comparison can include a sequence analysis or comparison algorithm or computer program
embodiment thereof (e.g., FASTA, TFASTA, GAP, BESTFIT) and/or the comparison may
be of the relative amount of a peptide or nucleic acid sequence in a pool of sequences
determined from a polypeptide or nucleic acid sample of a specimen. 

The invention also preferably provides a magnetic disk, such as an IBM- 
compatible (DOS, Windows, Windows95/98/2000, Windows NT, OS/2) or other format 
(e.g., Linux, SunOS, Solaris, AIX, SCO Unix, VMS, MV, Macintosh, etc.) floppy diskette or
hard (fixed, Winchester) disk drive, comprising a bit pattern encoding data from an assay of
the invention in a file format suitable for retrieval and processing in a computerized sequence 
analysis, comparison, or relative quantitation method. 

The invention also provides a network, comprising a plurality of computing 
devices linked via a data link, such as an Ethernet cable (coax or 10BaseT), telephone line,
ISDN line, wireless network, optical fiber, or other suitable signal transmission medium,
whereby at least one network device (e.g., computer, disk array, etc.) comprises a pattern of 
magnetic domains (e.g., magnetic disk) and/or charge domains (e.g., an array of DRAM 
cells) composing a bit pattern encoding data acquired from an assay of the invention. 

The invention also provides a method for transmitting assay data that includes 

generating an electronic signal on an electronic communications device, such as a modem,
ISDN terminal adapter, DSL, cable modem, ATM switch, or the like, wherein the signal 
includes (in native or encrypted format) a bit pattern encoding data from an assay or a 
database comprising a plurality of assay results obtained by the method of the invention. 

In a preferred embodiment, the invention provides a computer system for 

comparing a query target to a database containing an array of data structures, such as an assay
result obtained by the method of the invention, and ranking database targets based on the
degree of identity and gap weight to the target data. A central processor is preferably
initialized to load and execute the computer program for alignment and/or comparison of the
assay results. Data for a query target is entered into the central processor via an I/O device.
Execution of the computer program results in the central processor retrieving the assay data
from the data file, which comprises a binary description of an assay result.

The target data or record and the computer program can be transferred to 
secondary memory, which is typically random access memory (e.g., DRAM, SRAM,
SGRAM, or SDRAM). Targets are ranked according to the degree of correspondence
between a selected assay characteristic (e.g., binding to a selected affinity moiety) and the
same characteristic of the query target and results are output via an I/O device. For example,
a central processor can be a conventional computer (e.g., Intel Pentium, PowerPC, Alpha,
PA-8000, SPARC, MIPS 4400, MIPS 10000, VAX, etc.); a program can be a commercial or
public domain molecular biology software package (e.g., UWGCG Sequence Analysis
Software, Darwin); a data file can be an optical or magnetic disk, a data server, a memory
device (e.g., DRAM, SRAM, SGRAM, SDRAM, EPROM, bubble memory, flash memory,
etc.); an I/O device can be a terminal comprising a video display and a keyboard, a modem, 
an ISDN terminal adapter, an Ethernet port, a punched card reader, a magnetic strip reader, or 
other suitable I/O device. 

The invention also preferably provides the use of a computer system, such as 

that described above, which comprises: (1) a computer; (2) a stored bit pattern encoding a
collection of peptide sequence specificity records obtained by the methods of the invention,
which may be stored in the computer; (3) a comparison target, such as a query target; and (4)
a program for alignment and comparison, typically with rank-ordering of comparison results
on the basis of computed similarity values.
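
The rank-ordering of stored targets against a query can be sketched as follows. This Python example uses the standard library's difflib.SequenceMatcher ratio as a stand-in similarity value; it is not one of the alignment programs named in the specification and applies no gap weighting. The sequences and target names are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity in [0, 1] from difflib; a simple stand-in for an
    alignment score, chosen only to keep the sketch self-contained."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def rank_targets(query, database):
    """Return (name, similarity) pairs, most similar to the query first."""
    scored = [(name, similarity(query, seq)) for name, seq in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical stored peptide sequences keyed by hypothetical target names.
database = {
    "target_1": "MKTAYIAK",
    "target_2": "MKTAYIAR",
    "target_3": "GGGGGGGG",
}
ranked = rank_targets("MKTAYIAK", database)
```

A production system would replace similarity with a proper alignment score, but the ranking step would keep the same shape.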

Characteristics of prostate cancer-associated proteins 

Prostate cancer proteins of the present invention may be classified as secreted 
proteins, transmembrane proteins or intracellular proteins. In one embodiment, the prostate 
cancer protein is an intracellular protein. Intracellular proteins may be found in the 
cytoplasm and/or in the nucleus. Intracellular proteins are involved in all aspects of cellular
function and replication (including, e.g., signaling pathways); aberrant expression of such
proteins often results in unregulated or disregulated cellular processes (see, e.g., Molecular
Biology of the Cell (Alberts, ed., 3rd ed., 1994)). For example, many intracellular proteins
have enzymatic activity such as protein kinase activity, protein phosphatase activity, protease
activity, nucleotide cyclase activity, polymerase activity and the like. Intracellular proteins
also serve as docking proteins that are involved in organizing complexes of proteins, or
targeting proteins to various subcellular localizations, and are involved in maintaining the
structural integrity of organelles.

An increasingly appreciated concept in characterizing proteins is the presence 
in the proteins of one or more motifs for which defined functions have been attributed. In 

addition to the highly conserved sequences found in the enzymatic domain of proteins, highly
conserved sequences have been identified in proteins that are involved in protein-protein 
interaction. For example, Src-homology-2 (SH2) domains bind tyrosine-phosphorylated 
targets in a sequence dependent manner. PTB domains, which are distinct from SH2
domains, also bind tyrosine phosphorylated targets. SH3 domains bind to proline-rich
targets. In addition, PH domains, tetratricopeptide repeats and WD domains, to name only a
few, have been shown to mediate protein-protein interactions. Some of these may also be 
involved in binding to phospholipids or other second messengers. As will be appreciated by 
one of ordinary skill in the art, these motifs can be identified on the basis of primary 
sequence; thus, an analysis of the sequence of proteins may provide insight into both the 

enzymatic potential of the molecule and/or molecules with which the protein may associate.
One useful database is Pfam (protein families), which is a large collection of multiple 
sequence alignments and hidden Markov models covering many common protein domains. 
Versions are available via the internet from Washington University in St. Louis, the Sanger
Center in England, and the Karolinska Institute in Sweden (see, e.g., Bateman et al., Nuc.
Acids Res. 28:263-266 (2000); Sonnhammer et al., Proteins 28:405-420 (1997); Bateman et
al., Nuc. Acids Res. 27:260-262 (1999); and Sonnhammer et al., Nuc. Acids Res. 26:320-322
(1998)).

In another embodiment, the prostate cancer sequences are transmembrane 
proteins. Transmembrane proteins are molecules that span a phospholipid bilayer of a cell. 
They may have an intracellular domain, an extracellular domain, or both. The intracellular
domains of such proteins may have a number of functions including those already described
for intracellular proteins. For example, the intracellular domain may have enzymatic activity
and/or may serve as a binding site for additional proteins. Frequently the intracellular
domain of transmembrane proteins serves both roles. For example, certain receptor tyrosine
kinases have both protein kinase activity and SH2 domains. In addition, autophosphorylation
of tyrosines on the receptor molecule itself creates binding sites for additional SH2 domain
containing proteins.

Transmembrane proteins may contain from one to many transmembrane 
domains. For example, receptor tyrosine kinases, certain cytokine receptors, receptor 
guanylyl cyclases and receptor serine/threonine protein kinases contain a single 

transmembrane domain. However, various other proteins including channels and adenylyl
cyclases contain numerous transmembrane domains. Many important cell surface receptors 
such as G protein coupled receptors (GPCRs) are classified as "seven transmembrane 
domain" proteins, as they contain 7 membrane spanning regions. Characteristics of 
transmembrane domains include approximately 20 consecutive hydrophobic amino acids that 

15 may be followed by charged amino acids. Therefore, upon analysis of the amino acid 
sequence of a particular protein, the localization and number of transmembrane domains 
within the protein may be predicted {see, e.g, PSORT web site http://psort.nibb.ac.jp/). 
Important transmembrane protein receptors include, but are not limited to the insulin 
receptor, insulin-like growth factor receptor, human growth hormone receptor, glucose 

20 transporters, transferrin receptor, epidermal growth factor receptor, low density lipoprotein 
receptor, epidermal growth factor receptor, leptin receptor, interleukin receptors, e.g. IL-I 
receptor, IL-2 receptor, 
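
By way of illustration only, the prediction of transmembrane domains from a run of approximately 20 hydrophobic residues can be sketched as a sliding-window hydropathy scan. The sketch below is not the PSORT method; it assumes the published Kyte-Doolittle hydropathy scale, and the function name `candidate_tm_segments`, the 20-residue window and the 1.6 threshold are illustrative assumptions rather than parameters taken from this specification.

```python
# Kyte-Doolittle hydropathy values (an assumed scale; not taken from this document).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def candidate_tm_segments(seq, window=20, threshold=1.6):
    """Return (start, end) pairs (0-based, end exclusive) covering every
    window whose mean hydropathy is at least `threshold`.

    Overlapping or adjacent qualifying windows are merged into one segment.
    `window` and `threshold` are hypothetical defaults chosen for illustration.
    """
    seq = seq.upper()
    segments = []
    for i in range(len(seq) - window + 1):
        mean = sum(KD[aa] for aa in seq[i:i + window]) / window
        if mean >= threshold:
            if segments and i <= segments[-1][1]:
                # Extend the previous segment instead of starting a new one.
                segments[-1] = (segments[-1][0], i + window)
            else:
                segments.append((i, i + window))
    return segments


if __name__ == "__main__":
    # Toy sequence: a 20-residue leucine stretch flanked by charged lysines.
    toy = "K" * 10 + "L" * 20 + "K" * 10
    print(candidate_tm_segments(toy))
```

The number of segments returned gives a rough estimate of the number of membrane-spanning regions; a dedicated prediction program would be used in practice.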

The extracellular domains of transmembrane proteins are diverse; however, 
conserved motifs are found repeatedly among various extracellular domains. Conserved 

25 structure and/or functions have been ascribed to different extracellular motifs. Many 

extracellular domains are involved in binding to other molecules. In one aspect, extracellular 
domains are found on receptors. Factors that bind the receptor domain include circulating 
ligands, which may be peptides, proteins, or small molecules such as adenosine and the like. 
For example, growth factors such as EGF, FGF and PDGF are circulating growth factors that 

30 bind to their cognate receptors to initiate a variety of cellular responses. Other factors include 
cytokines, mitogenic factors, neurotrophic factors and the like. Extracellular domains also
bind to cell-associated molecules. In this respect, they mediate cell-cell interactions.
Cell-associated ligands can be tethered to the cell, e.g., via a glycosylphosphatidylinositol (GPI)
anchor, or may themselves be transmembrane proteins. Extracellular domains also associate 
with the extracellular matrix and contribute to the maintenance of the cell structure. 

Prostate cancer proteins that are transmembrane are particularly preferred in
the present invention as they are readily accessible targets for immunotherapeutics, as
described herein. In addition, as outlined below, transmembrane proteins can also be useful
in imaging modalities. Antibodies may be used to label such readily accessible proteins in
situ. Alternatively, antibodies can also label intracellular proteins, in which case samples are
typically permeabilized to provide access to intracellular proteins.

It will also be appreciated by those in the art that a transmembrane protein can
be made soluble by removing transmembrane sequences, e.g., through recombinant methods.
Furthermore, transmembrane proteins that have been made soluble can be made to be 
secreted through recombinant means by adding an appropriate signal sequence. 

In another embodiment, the prostate cancer proteins are secreted proteins, the
secretion of which can be either constitutive or regulated. These proteins have a signal
peptide or signal sequence that targets the molecule to the secretory pathway. Secreted 
proteins are involved in numerous physiological events; by virtue of their circulating nature, 
they serve to transmit signals to various other cell types. The secreted protein may function in 

20 an autocrine manner (acting on the cell that secreted the factor), a paracrine manner (acting 
on cells in close proximity to the cell that secreted the factor) or an endocrine manner (acting 
on cells at a distance). Thus secreted molecules find use in modulating or altering numerous 
aspects of physiology. Prostate cancer proteins that are secreted proteins are particularly 
preferred in the present invention as they serve as good targets for diagnostic markers, e.g., 

25 for blood, plasma, serum, or stool tests. 

Use of prostate cancer nucleic acids 

As described above, a prostate cancer sequence is initially identified by
substantial nucleic acid and/or amino acid sequence homology or linkage to the prostate
cancer sequences outlined herein. Such homology can be based upon the overall nucleic acid
or amino acid sequence, and is generally determined as outlined below, using either
homology programs or hybridization conditions. Typically, linked sequences on an mRNA are
found on the same molecule.

The prostate cancer nucleic acid sequences of the invention, e.g., the 
sequences in Tables 1-16, can be fragments of larger genes, i.e., they are nucleic acid 

5 segments. "Genes" in this context includes coding regions, non-coding regions, and mixtures 
of coding and non-coding regions. Accordingly, as will be appreciated by those in the art, 
using the sequences provided herein, extended sequences, in either direction, of the prostate 
cancer genes can be obtained, using techniques well known in the art for cloning either longer 
sequences or the full length sequences; see Ausubel et al., supra. Much can be done by
informatics and many sequences can be clustered to include multiple sequences
corresponding to a single gene, e.g., systems such as UniGene (see,
http://www.ncbi.nlm.nih.gov/UniGene/).

Once the prostate cancer nucleic acid is identified, it can be cloned and, if 
necessary, its constituent parts recombined to form the entire prostate cancer nucleic acid 

15 coding regions or the entire mRNA sequence. Once isolated from its natural source, e.g., 
contained within a plasmid or other vector or excised therefrom as a linear nucleic acid 
segment, the recombinant prostate cancer nucleic acid can be further used as a probe to
identify and isolate other prostate cancer nucleic acids, e.g., extended coding regions. It can 
also be used as a "precursor" nucleic acid to make modified or variant prostate cancer nucleic 

20 acids and proteins. 

The prostate cancer nucleic acids of the present invention are used in several 
ways. In a first embodiment, nucleic acid probes to the prostate cancer nucleic acids are 
made and attached to biochips to be used in screening and diagnostic methods, as outlined 
below, or for administration, e.g., for gene therapy, vaccine, and/or antisense applications. 

25 Alternatively, the prostate cancer nucleic acids that include coding regions of prostate cancer 
proteins can be put into expression vectors for the expression of prostate cancer proteins, 
again for screening purposes or for administration to a patient. 

In a preferred embodiment, nucleic acid probes to prostate cancer nucleic
acids (both the nucleic acid sequences outlined in the figures and/or the complements thereof)
are made. The nucleic acid probes attached to the biochip are designed to be substantially
complementary to the prostate cancer nucleic acids, i.e., the target sequence (either the target
sequence of the sample or to other probe sequences, e.g., in sandwich assays), such that
hybridization of the target sequence and the probes of the present invention occurs. As
outlined below, this complementarity need not be perfect; there may be any number of base
pair mismatches which will interfere with hybridization between the target sequence and the
single stranded nucleic acids of the present invention. However, if the number of mutations
is so great that no hybridization can occur under even the least stringent of hybridization
conditions, the sequence is not a complementary target sequence. Thus, by "substantially
complementary" herein is meant that the probes are sufficiently complementary to the target
sequences to hybridize under normal reaction conditions, particularly high stringency
conditions, as outlined herein.
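
As a rough illustration of imperfect complementarity, the number of base pair mismatches between a probe and a target site of equal length can be counted as sketched below. The mismatch count is only a crude proxy: whether hybridization actually occurs depends on the stringency conditions, which this sketch does not model. The function names `reverse_complement` and `count_mismatches` are hypothetical and are not part of any named software package.

```python
# Watson-Crick complements for DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_mismatches(probe, target_site):
    """Count positions at which `probe` fails to base-pair with `target_site`.

    Both sequences are written 5'->3' and must have the same length.
    A perfectly complementary probe equals the reverse complement of the
    target site, so mismatches are counted against that sequence.
    """
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must have equal length")
    perfect = reverse_complement(target_site)
    return sum(1 for p, q in zip(probe.upper(), perfect) if p != q)


if __name__ == "__main__":
    site = "ATGCGTAC"
    print(count_mismatches("GTACGCAT", site))  # perfectly complementary probe
    print(count_mismatches("GTACGCAA", site))  # probe with one altered base
```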

A nucleic acid probe is generally single stranded but can be partially single 
and partially double stranded. The strandedness of the probe is dictated by the structure, 
composition, and properties of the target sequence. In general, the nucleic acid probes range 
from about 8 to about 100 bases long, with from about 10 to about 80 bases being preferred, 

and from about 30 to about 50 bases being particularly preferred. That is, generally whole
genes are not used. In some embodiments, much longer nucleic acids can be used, up to
hundreds of bases.

In a preferred embodiment, more than one probe per sequence is used, with
either overlapping probes or probes to different sections of the target being used. That is,
two, three, four or more probes, with three being preferred, are used to build in a redundancy
for a particular target. The probes can be overlapping (i.e., have some sequence in common),
or separate. In some cases, PCR primers may be used to amplify signal for higher sensitivity.
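
The use of several probes per target can be sketched as follows. This is a minimal illustration, assuming evenly spaced probes that are the reverse complement of the target; the function name `tile_probes` and its defaults (40-base probes, three probes per target) are illustrative choices within the ranges stated above, not a prescribed design procedure.

```python
# Watson-Crick complements for DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tile_probes(target, probe_len=40, n_probes=3):
    """Return a list of (start, probe) pairs spread evenly across `target`.

    Each probe is the reverse complement of target[start:start + probe_len].
    When the target is shorter than n_probes * probe_len the probes overlap;
    otherwise they are separate.
    """
    target = target.upper()
    if n_probes < 1:
        raise ValueError("n_probes must be at least 1")
    if len(target) < probe_len:
        raise ValueError("target is shorter than the requested probe length")
    span = len(target) - probe_len
    if n_probes == 1:
        starts = [0]
    else:
        starts = [round(k * span / (n_probes - 1)) for k in range(n_probes)]
    return [(s, reverse_complement(target[s:s + probe_len])) for s in starts]


if __name__ == "__main__":
    # Toy 100-base target: three 40-base probes must overlap by 10 bases each.
    for start, probe in tile_probes("ACGT" * 25):
        print(start, probe)
```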

As will be appreciated by those in the art, nucleic acids can be attached or
immobilized to a solid support in a wide variety of ways. By "immobilized" and grammatical
equivalents herein is meant that the association or binding between the nucleic acid probe and the
solid support is sufficient to be stable under the conditions of binding, washing, analysis, and
removal as outlined below. The binding can typically be covalent or non-covalent. By
"non-covalent binding" and grammatical equivalents herein is meant one or more of electrostatic,
hydrophilic, and hydrophobic interactions. Included in non-covalent binding is the covalent
attachment of a molecule, such as streptavidin, to the support and the non-covalent binding of
the biotinylated probe to the streptavidin. By "covalent binding" and grammatical
equivalents herein is meant that the two moieties, the solid support and the probe, are
attached by at least one bond, including sigma bonds, pi bonds and coordination bonds.
Covalent bonds can be formed directly between the probe and the solid support or can be
formed by a cross linker or by inclusion of a specific reactive group on either the solid
support or the probe or both molecules. Immobilization may also involve a combination of
covalent and non-covalent interactions.

In general, the probes are attached to the biochip in a wide variety of ways, as
will be appreciated by those in the art. As described herein, the nucleic acids can either be
synthesized first, with subsequent attachment to the biochip, or can be directly synthesized on 

10 the biochip. 

The biochip comprises a suitable solid substrate. By "substrate" or "solid 
support" or other grammatical equivalents herein is meant a material that can be modified to 
contain discrete individual sites appropriate for the attachment or association of the nucleic 
acid probes and is amenable to at least one detection method. As will be appreciated by those
in the art, the number of possible substrates is very large, and includes, but is not limited to,
glass and modified or functionalized glass, plastics (including acrylics, polystyrene and
copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene,
polyurethanes, Teflon™, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or
silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses,

20 plastics, etc. In general, the substrates allow optical detection and do not appreciably 

fluoresce. A preferred substrate is described in copending application entitled Reusable Low 
Fluorescent Plastic Biochip, U.S. Application Serial No. 09/270,214, filed March 15, 1999, 
herein incorporated by reference in its entirety. 

Generally the substrate is planar, although as will be appreciated by those in 

25 the art, other configurations of substrates may be used as well. For example, the probes may 
be placed on the inside surface of a tube, for flow-through sample analysis to minimize 
sample volume. Similarly, the substrate may be flexible, such as a flexible foam, including 
closed cell foams made of particular plastics. 

In a preferred embodiment, the surface of the biochip and the probe may be 

30 derivatized with chemical functional groups for subsequent attachment of the two. Thus, e.g., 
the biochip is derivatized with a chemical functional group including, but not limited to, 

amino groups, carboxy groups, oxo groups and thiol groups, with amino groups being 
particularly preferred. Using these functional groups, the probes can be attached using 
functional groups on the probes. For example, nucleic acids containing amino groups can be 
attached to surfaces comprising amino groups, e.g., using linkers as are known in the art, such as
homo- or hetero-bifunctional linkers (see 1994 Pierce Chemical Company
catalog, technical section on cross-linkers, pages 155-200). In addition, in some cases,
additional linkers, such as alkyl groups (including substituted and heteroalkyl groups) may be 
used. 

In this embodiment, oligonucleotides are synthesized as is known in the art, 

10 and then attached to the surface of the solid support. As will be appreciated by those skilled 
in the art, either the 5' or 3' terminus may be attached to the solid support, or attachment may 
be via an internal nucleoside. 

In another embodiment, the immobilization to the solid support may be very 
strong, yet non-covalent. For example, biotinylated oligonucleotides can be made, which
bind to surfaces covalently coated with streptavidin, resulting in attachment.

Alternatively, the oligonucleotides may be synthesized on the surface, as is 
known in the art. For example, photoactivation techniques utilizing photopolymerization 
compounds and techniques are used. In a preferred embodiment, the nucleic acids can be 
synthesized in situ, using well known photolithographic techniques, such as those described 

in WO 95/25116; WO 95/35505; U.S. Patent Nos. 5,700,637 and 5,445,934; and references
cited within, all of which are expressly incorporated by reference; these methods of
attachment form the basis of the Affymetrix GeneChip™ technology.

Often, amplification-based assays are performed to measure the expression
level of prostate cancer-associated sequences. These assays are typically performed in
conjunction with reverse transcription. In such assays, a prostate cancer-associated nucleic
acid sequence acts as a template in an amplification reaction (e.g., Polymerase Chain
Reaction, or PCR). In a quantitative amplification, the amount of amplification product will
be proportional to the amount of template in the original sample. Comparison to appropriate
controls provides a measure of the amount of prostate cancer-associated RNA. Methods of
quantitative amplification are well known to those of skill in the art. Detailed protocols for
quantitative PCR are provided, e.g., in Innis et al., PCR Protocols, A Guide to Methods and
Applications (1990).
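
The proportionality between amplification product and starting template can be illustrated with a simple exponential model, in which each cycle multiplies the product by (1 + E), where E is the per-cycle efficiency. This is a sketch under stated assumptions: it applies only to the exponential phase of the reaction, the comparison by threshold cycle is one commonly used convention that is not described in this specification, and the function names are hypothetical.

```python
def amplicon_copies(template_copies, cycles, efficiency=1.0):
    """Idealized product copy number after `cycles` rounds of amplification.

    `efficiency` is the assumed fraction of templates copied per cycle
    (1.0 means perfect doubling). Product is proportional to the starting
    template for any fixed number of cycles.
    """
    return template_copies * (1.0 + efficiency) ** cycles


def relative_template(ct_sample, ct_control, efficiency=1.0):
    """Estimate the sample/control ratio of starting template.

    `ct_sample` and `ct_control` are the cycle numbers at which each reaction
    reaches the same product threshold. A sample crossing the threshold
    earlier than the control started with more template.
    """
    return (1.0 + efficiency) ** (ct_control - ct_sample)


if __name__ == "__main__":
    print(amplicon_copies(100, 10))   # 100 templates after 10 perfect cycles
    print(relative_template(22, 25))  # sample crosses threshold 3 cycles earlier
```

Under perfect doubling, a sample that reaches the threshold three cycles before the control is estimated to contain eight times as much starting template.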

In some embodiments, a TaqMan based assay is used to measure expression. 
TaqMan based assays use a fluorogenic oligonucleotide probe that contains a 5' fluorescent 
dye and a 3' quenching agent. The probe hybridizes to a PCR product, but cannot itself be
extended due to a blocking agent at the 3' end. When the PCR product is amplified in
subsequent cycles, the 5' nuclease activity of the polymerase, e.g., AmpliTaq, results in the
cleavage of the TaqMan probe. This cleavage separates the 5' fluorescent dye and the 3'
quenching agent, thereby resulting in an increase in fluorescence as a function of 

10 amplification (see, e.g., literature provided by Perkin-Elmer, e.g., www2.perkin-elmer.com). 

Other suitable amplification methods include, but are not limited to, ligase
chain reaction (LCR) (see Wu & Wallace, Genomics 4:560 (1989), Landegren et al., Science
241:1077 (1988), and Barringer et al., Gene 89:117 (1990)), transcription amplification
(Kwoh et al., Proc. Natl. Acad. Sci. USA 86:1173 (1989)), self-sustained sequence replication
(Guatelli et al., Proc. Natl. Acad. Sci. USA 87:1874 (1990)), dot PCR, and linker adapter PCR,
etc.

Expression of prostate cancer proteins from nucleic acids 

In a preferred embodiment, prostate cancer nucleic acids, e.g., those encoding
prostate cancer proteins, are used to make a variety of expression vectors to express prostate
cancer proteins which can then be used in screening assays, as described below. Expression
vectors and recombinant DNA technology are well known to those of skill in the art (see,
e.g., Ausubel, supra, and Gene Expression Systems (Fernandez & Hoeffler, eds, 1999)) and
are used to express proteins. The expression vectors may be either self-replicating
extrachromosomal vectors or vectors which integrate into a host genome. Generally, these
expression vectors include transcriptional and translational regulatory nucleic acid operably
linked to the nucleic acid encoding the prostate cancer protein. The term "control sequences"
refers to DNA sequences used for the expression of an operably linked coding sequence in a
particular host organism. Control sequences that are suitable for prokaryotes, e.g., include a
promoter, optionally an operator sequence, and a ribosome binding site. Eukaryotic cells are
known to utilize promoters, polyadenylation signals, and enhancers.

Nucleic acid is "operably linked" when it is placed into a functional
relationship with another nucleic acid sequence. For example, DNA for a presequence or 
secretory leader is operably linked to DNA for a polypeptide if it is expressed as a preprotein 
that participates in the secretion of the polypeptide; a promoter or enhancer is operably linked 

5 to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site 
is operably linked to a coding sequence if it is positioned so as to facilitate translation. 
Generally, "operably linked" means that the DNA sequences being linked are contiguous, 
and, in the case of a secretory leader, contiguous and in reading phase. However, enhancers 
do not have to be contiguous. Linking is typically accomplished by ligation at convenient 

10 restriction sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are 
used in accordance with conventional practice. Transcriptional and translational regulatory 
nucleic acid will generally be appropriate to the host cell used to express the prostate cancer 
protein. Numerous types of appropriate expression vectors, and suitable regulatory 
sequences are known in the art for a variety of host cells. 

15 In general, transcriptional and translational regulatory sequences may include, 

but are not limited to, promoter sequences, ribosomal binding sites, transcriptional start and 
stop sequences, translational start and stop sequences, and enhancer or activator sequences. 
In a preferred embodiment, the regulatory sequences include a promoter and transcriptional 
start and stop sequences. 

20 Promoter sequences encode either constitutive or inducible promoters. The 

promoters may be either naturally occurring promoters or hybrid promoters. Hybrid 
promoters, which combine elements of more than one promoter, are also known in the art, 
and are useful in the present invention. 

In addition, an expression vector may comprise additional elements. For 

25 example, the expression vector may have two replication systems, thus allowing it to be 
maintained in two organisms, e.g., in mammalian or insect cells for expression and in a
prokaryotic host for cloning and amplification. Furthermore, for integrating expression
vectors, the expression vector contains at least one sequence homologous to the host cell
genome, and preferably two homologous sequences which flank the expression construct.
The integrating vector may be directed to a specific locus in the host cell by selecting the
appropriate homologous sequence for inclusion in the vector. Constructs for integrating
vectors are well known in the art (e.g., Fernandez & Hoeffler, supra).

In addition, in a preferred embodiment, the expression vector contains a 
selectable marker gene to allow the selection of transformed host cells. Selection genes are 

5 well known in the art and will vary with the host cell used. 

The prostate cancer proteins of the present invention are produced by culturing 
a host cell transformed with an expression vector containing nucleic acid encoding a prostate 
cancer protein, under the appropriate conditions to induce or cause expression of the prostate 
cancer protein. Conditions appropriate for prostate cancer protein expression will vary with 

10 the choice of the expression vector and the host cell, and will be easily ascertained by one 
skilled in the art through routine experimentation or optimization. For example, the use of 
constitutive promoters in the expression vector will require optimizing the growth and 
proliferation of the host cell, while the use of an inducible promoter requires the appropriate 
growth conditions for induction. In addition, in some embodiments, the timing of the harvest
is important. For example, the baculoviral systems used in insect cell expression are lytic
viruses, and thus harvest time selection can be crucial for product yield.

Appropriate host cells include yeast, bacteria, archaebacteria, fungi, and insect 
and animal cells, including mammalian cells. Of particular interest are Saccharomyces 
cerevisiae and other yeasts, E. coli, Bacillus subtilis, Sf9 cells, C129 cells, 293 cells, 

20 Neurospora, BHK, CHO, COS, HeLa cells, HUVEC (human umbilical vein endothelial 
cells), THP1 cells (a macrophage cell line) and various other human cells and cell lines. 

In a preferred embodiment, the prostate cancer proteins are expressed in
mammalian cells. Mammalian expression systems are also known in the art, and include
retroviral and adenoviral systems. One expression vector system is a retroviral vector system
such as is generally described in PCT/US97/01019 and PCT/US97/01048, both of which are
hereby expressly incorporated by reference. Of particular use as mammalian promoters are
the promoters from mammalian viral genes, since the viral genes are often highly expressed
and have a broad host range. Examples include the SV40 early promoter, mouse mammary
tumor virus LTR promoter, adenovirus major late promoter, herpes simplex virus promoter,
and the CMV promoter (see, e.g., Fernandez & Hoeffler, supra). Typically, transcription
termination and polyadenylation sequences recognized by mammalian cells are regulatory
regions located 3' to the translation stop codon and thus, together with the promoter elements,
flank the coding sequence. Examples of transcription terminator and polyadenylation signals
include those derived from SV40.

The methods of introducing exogenous nucleic acid into mammalian hosts, as
well as other hosts, are well known in the art, and will vary with the host cell used.
Techniques include dextran-mediated transfection, calcium phosphate precipitation, 
polybrene mediated transfection, protoplast fusion, electroporation, viral infection, 
encapsulation of the polynucleotide(s) in liposomes, and direct microinjection of the DNA 
into nuclei. 

10 In a preferred embodiment, prostate cancer proteins are expressed in bacterial 

systems. Bacterial expression systems are well known in the art. Promoters from
bacteriophage may also be used and are known in the art. In addition, synthetic promoters 
and hybrid promoters are also useful; e.g., the tac promoter is a hybrid of the trp and lac 
promoter sequences. Furthermore, a bacterial promoter can include naturally occurring 

15 promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and 
initiate transcription. In addition to a functioning promoter sequence, an efficient ribosome 
binding site is desirable. The expression vector may also include a signal peptide sequence 
that provides for secretion of the prostate cancer protein in bacteria. The protein is either 
secreted into the growth media (gram-positive bacteria) or into the periplasmic space, located 

20 between the inner and outer membrane of the cell (gram-negative bacteria). The bacterial 
expression vector may also include a selectable marker gene to allow for the selection of 
bacterial strains that have been transformed. Suitable selection genes include genes which 
render the bacteria resistant to drugs such as ampicillin, chloramphenicol, erythromycin, 
kanamycin, neomycin and tetracycline. Selectable markers also include biosynthetic genes, 

25 such as those in the histidine, tryptophan and leucine biosynthetic pathways. These 

components are assembled into expression vectors. Expression vectors for bacteria are well 
known in the art, and include vectors for Bacillus subtilis, E. coli, Streptococcus cremoris, 
and Streptomyces lividans, among others (e.g., Fernandez & Hoeffler, supra). The bacterial
expression vectors are transformed into bacterial host cells using techniques well known in 

the art, such as calcium chloride treatment, electroporation, and others.

In one embodiment, prostate cancer proteins are produced in insect cells.
Expression vectors for the transformation of insect cells, and in particular, baculovirus-based 
expression vectors, are well known in the art. 

In a preferred embodiment, prostate cancer protein is produced in yeast cells.
Yeast expression systems are well known in the art, and include expression vectors for
Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula polymorpha,
Kluyveromyces fragilis and K. lactis, Pichia guillerimondii and P. pastoris,
Schizosaccharomyces pombe, and Yarrowia lipolytica.

The prostate cancer protein may also be made as a fusion protein, using 
10 techniques well known in the art. Thus, e.g., for the creation of monoclonal antibodies, if the 
desired epitope is small, the prostate cancer protein may be fused to a carrier protein to form 
an immunogen. Alternatively, the prostate cancer protein may be made as a fusion protein to 
increase expression, or for other reasons. For example, when the prostate cancer protein is a 
prostate cancer peptide, the nucleic acid encoding the peptide may be linked to other nucleic 
15 acid for expression purposes. 

In a preferred embodiment, the prostate cancer protein is purified or isolated
after expression. Prostate cancer proteins may be isolated or purified in a variety of ways 
known to those skilled in the art depending on what other components are present in the 
sample. Standard purification methods include electrophoretic, molecular, immunological 
20 and chromatographic techniques, including ion exchange, hydrophobic, affinity, and reverse- 
phase HPLC chromatography, and chromatofocusing. For example, the prostate cancer 
protein may be purified using a standard anti-prostate cancer protein antibody column. 
Ultrafiltration and diafiltration techniques, in conjunction with protein concentration, are also 
useful. For general guidance in suitable purification techniques, see Scopes, Protein 
25 Purification (1982). The degree of purification necessary will vary depending on the use of 
the prostate cancer protein. In some instances no purification will be necessary. 

Once expressed and purified if necessary, the prostate cancer proteins and 
nucleic acids are useful in a number of applications. They may be used as immunoselection 
reagents, as vaccine reagents, as screening agents, etc. 

Variants of prostate cancer proteins

In one embodiment, the prostate cancer proteins are derivative or variant 
prostate cancer proteins as compared to the wild-type sequence. That is, as outlined more 
fully below, the derivative prostate cancer peptide will often contain at least one amino acid 

5 substitution, deletion or insertion, with amino acid substitutions being particularly preferred. 
The amino acid substitution, insertion or deletion may occur at any residue within the 
prostate cancer peptide. 

Also included within one embodiment of prostate cancer proteins of the 
present invention are amino acid sequence variants. These variants typically fall into one or 

10 more of three classes: substitutional, insertional or deletional variants. These variants 

ordinarily are prepared by site specific mutagenesis of nucleotides in the DNA encoding the 
prostate cancer protein, using cassette or PCR mutagenesis or other techniques well known in 
the art, to produce DNA encoding the variant, and thereafter expressing the DNA in 
recombinant cell culture as outlined above. However, variant prostate cancer protein 

15 fragments having up to about 100-150 residues may be prepared by in vitro synthesis using 
established techniques. Amino acid sequence variants are characterized by the predetermined 
nature of the variation, a feature that sets them apart from naturally occurring allelic or 
interspecies variation of the prostate cancer protein amino acid sequence. The variants 
typically exhibit the same qualitative biological activity as the naturally occurring analogue,
although variants can also be selected which have modified characteristics as will be more
fully outlined below.

While the site or region for introducing an amino acid sequence variation is 
predetermined, the mutation per se need not be predetermined. For example, in order to 
optimize the performance of a mutation at a given site, random mutagenesis may be 

25 conducted at the target codon or region and the expressed prostate cancer variants screened 
for the optimal combination of desired activity. Techniques for making substitution 
mutations at predetermined sites in DNA having a known sequence are well known, e.g., 
M13 primer mutagenesis and PCR mutagenesis. Screening of the mutants is done using 
assays of prostate cancer protein activities. 

30 Amino acid substitutions are typically of single residues; insertions usually 

will be on the order of from about 1 to 20 amino acids, although considerably larger 

insertions may be tolerated. Deletions range from about 1 to about 20 residues, although in 
some cases deletions may be much larger. 

Substitutions, deletions, insertions or any combination thereof may be used to 
arrive at a final derivative. Generally these changes are done on a few amino acids to 

5 minimize the alteration of the molecule. However, larger changes may be tolerated in certain 
circumstances. When small alterations in the characteristics of the prostate cancer protein are 
desired, substitutions are generally made in accordance with the amino acid substitution 
relationships provided in the definition section. 

The variants typically exhibit the same qualitative biological activity and will

10 elicit the same immune response as the naturally-occurring analog, although variants also are 
selected to modify the characteristics of the prostate cancer proteins as needed. Alternatively, 
the variant may be designed such that the biological activity of the prostate cancer protein is 
altered. For example, glycosylation sites may be altered or removed. 

Substantial changes in function or immunological identity are made by
selecting substitutions that are less conservative than those described above. For example,
substitutions may be made which more significantly affect: the structure of the polypeptide
backbone in the area of the alteration, for example the alpha-helical or beta-sheet structure;
the charge or hydrophobicity of the molecule at the target site; or the bulk of the side chain.
The substitutions which in general are expected to produce the greatest changes in the
polypeptide's properties are those in which (a) a hydrophilic residue, e.g. seryl or threonyl, is
substituted for (or by) a hydrophobic residue, e.g. leucyl, isoleucyl, phenylalanyl, valyl or
alanyl; (b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue
having an electropositive side chain, e.g. lysyl, arginyl, or histidyl, is substituted for (or by)
an electronegative residue, e.g. glutamyl or aspartyl; or (d) a residue having a bulky side
chain, e.g. phenylalanine, is substituted for (or by) one not having a side chain, e.g. glycine.
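
By way of illustration only, the four substitution classes (a) through (d) above can be expressed as a short Python sketch. The residue groupings are limited to the examples named in the text, the one-letter codes and the function name `disruptive_classes` are assumptions introduced here, and the sketch is not a method described in this specification.

```python
# Residue groupings limited to the examples given in classes (a)-(d) above.
# One-letter amino acid codes are an assumption made for this sketch.
HYDROPHILIC = {"S", "T"}                  # seryl, threonyl
HYDROPHOBIC = {"L", "I", "F", "V", "A"}   # leucyl, isoleucyl, phenylalanyl, valyl, alanyl
ELECTROPOSITIVE = {"K", "R", "H"}         # lysyl, arginyl, histidyl
ELECTRONEGATIVE = {"E", "D"}              # glutamyl, aspartyl
BULKY = {"F"}                             # phenylalanine
NO_SIDE_CHAIN = {"G"}                     # glycine


def _swapped(a, b, group1, group2):
    """True if one residue is in group1 and the other in group2 ("for (or by)")."""
    return (a in group1 and b in group2) or (a in group2 and b in group1)


def disruptive_classes(original, replacement):
    """Return the classes (a)-(d) that a single-residue substitution falls into."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return []  # not a substitution
    hits = []
    if _swapped(a, b, HYDROPHILIC, HYDROPHOBIC):
        hits.append("a")
    if a in {"C", "P"} or b in {"C", "P"}:
        hits.append("b")
    if _swapped(a, b, ELECTROPOSITIVE, ELECTRONEGATIVE):
        hits.append("c")
    if _swapped(a, b, BULKY, NO_SIDE_CHAIN):
        hits.append("d")
    return hits
```

An empty result means only that the substitution is not among the named examples; it does not establish that the substitution is conservative.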

Covalent modifications of prostate cancer polypeptides are included within the 
scope of this invention. One type of covalent modification includes reacting targeted amino 
acid residues of a prostate cancer polypeptide with an organic derivatizing agent that is 
capable of reacting with selected side chains or the N- or C-terminal residues of a prostate
cancer polypeptide. Derivatization with bifunctional agents is useful, for instance, for
crosslinking prostate cancer polypeptides to a water-insoluble support matrix or surface for
use in the method for purifying anti-prostate cancer polypeptide antibodies or screening
assays, as is more fully described below. Commonly used crosslinking agents include, e.g.,
1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, e.g.,
esters with 4-azidosalicylic acid, homobifunctional imidoesters, including disuccinimidyl
esters such as 3,3'-dithiobis(succinimidylpropionate), bifunctional maleimides such as
bis-N-maleimido-1,8-octane and agents such as methyl-3-((p-azidophenyl)dithio)propioimidate.

Other modifications include deamidation of glutaminyl and asparaginyl 
residues to the corresponding glutamyl and aspartyl residues, respectively, hydroxylation of 
proline and lysine, phosphorylation of hydroxyl groups of seryl, threonyl or tyrosyl residues,
methylation of the amino groups of the lysine, arginine, and histidine side chains (Creighton,
Proteins: Structure and Molecular Properties, pp. 79-86 (1983)), acetylation of the
N-terminal amine, and amidation of any C-terminal carboxyl group.

Another type of covalent modification of the prostate cancer polypeptide 
included within the scope of this invention comprises altering the native glycosylation pattern
of the polypeptide. "Altering the native glycosylation pattern" is intended for purposes
herein to mean deleting one or more carbohydrate moieties found in native sequence prostate
cancer polypeptide, and/or adding one or more glycosylation sites that are not present in the
native sequence prostate cancer polypeptide. Glycosylation patterns can be altered in many
ways. For example, the use of different cell types to express prostate cancer-associated
sequences can result in different glycosylation patterns.

Addition of glycosylation sites to prostate cancer polypeptides may also be 
accomplished by altering the amino acid sequence thereof. The alteration may be made, e.g., 
by the addition of, or substitution by, one or more serine or threonine residues to the native 
sequence prostate cancer polypeptide (for O-linked glycosylation sites). The prostate cancer
amino acid sequence may optionally be altered through changes at the DNA level,
particularly by mutating the DNA encoding the prostate cancer polypeptide at preselected
bases such that codons are generated that will translate into the desired amino acids. 

Another means of increasing the number of carbohydrate moieties on the 
prostate cancer polypeptide is by chemical or enzymatic coupling of glycosides to the
polypeptide. Such methods are described in the art, e.g., in WO 87/05330, and in Aplin &
Wriston, CRC Crit. Rev. Biochem., pp. 259-306 (1981).

Removal of carbohydrate moieties present on the prostate cancer polypeptide
may be accomplished chemically or enzymatically or by mutational substitution of codons
encoding for amino acid residues that serve as targets for glycosylation. Chemical
deglycosylation techniques are known in the art and described, for instance, by Hakimuddin
et al., Arch. Biochem. Biophys., 259:52 (1987) and by Edge et al., Anal. Biochem., 118:131
(1981). Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the
use of a variety of endo- and exo-glycosidases as described by Thotakura et al., Meth.
Enzymol., 138:350 (1987).

Another type of covalent modification of prostate cancer polypeptides comprises linking the
prostate cancer polypeptide to one of a variety of nonproteinaceous polymers, e.g.,
polyethylene glycol, polypropylene glycol, or polyoxyalkylenes, in the manner set forth in
U.S. Patent Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192 or 4,179,337. 

Prostate cancer polypeptides of the present invention may also be modified in
a way to form chimeric molecules comprising a prostate cancer polypeptide fused to another,
heterologous polypeptide or amino acid sequence. In one embodiment, such a chimeric
molecule comprises a fusion of a prostate cancer polypeptide with a tag polypeptide which
provides an epitope to which an anti-tag antibody can selectively bind. The epitope tag is
generally placed at the amino- or carboxyl-terminus of the prostate cancer polypeptide. The
presence of such epitope-tagged forms of a prostate cancer polypeptide can be detected using
an antibody against the tag polypeptide. Also, provision of the epitope tag enables the
prostate cancer polypeptide to be readily purified by affinity purification using an anti-tag
antibody or another type of affinity matrix that binds to the epitope tag. In an alternative
embodiment, the chimeric molecule may comprise a fusion of a prostate cancer polypeptide
with an immunoglobulin or a particular region of an immunoglobulin. For a bivalent form of
the chimeric molecule, such a fusion could be to the Fc region of an IgG molecule.

Various tag polypeptides and their respective antibodies are well known in the
art. Examples include poly-histidine (poly-his) or poly-histidine-glycine (poly-his-gly) tags;
HIS6 and metal chelation tags; the flu HA tag polypeptide and its antibody 12CA5 (Field et
al., Mol. Cell. Biol. 8:2159-2165 (1988)); the c-myc tag and the 8F9, 3C7, 6E10, G4, B7 and
9E10 antibodies thereto (Evan et al., Molecular and Cellular Biology 5:3610-3616 (1985));
and the Herpes Simplex virus glycoprotein D (gD) tag and its antibody (Paborsky et al.,
Protein Engineering 3(6):547-553 (1990)). Other tag polypeptides include the Flag-peptide
(Hopp et al., BioTechnology 6:1204-1210 (1988)); the KT3 epitope peptide (Martin et al.,
Science 255:192-194 (1992)); tubulin epitope peptide (Skinner et al., J. Biol. Chem.
266:15163-15166 (1991)); and the T7 gene 10 protein peptide tag (Lutz-Freyermuth et al.,
Proc. Natl. Acad. Sci. USA 87:6393-6397 (1990)).

Also included are other prostate cancer proteins of the prostate cancer family, 
and prostate cancer proteins from other organisms, which are cloned and expressed as 
outlined below. Thus, probe or degenerate polymerase chain reaction (PCR) primer
sequences may be used to find other related prostate cancer proteins from humans or other
organisms. As will be appreciated by those in the art, particularly useful probe and/or PCR
primer sequences include the unique areas of the prostate cancer nucleic acid sequence. As is
generally known in the art, preferred PCR primers are from about 15 to about 35 nucleotides
in length, with from about 20 to about 30 being preferred, and may contain inosine as needed.
The conditions for the PCR reaction are well known in the art (e.g., Innis, PCR Protocols,
supra).
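
As a minimal sketch of the primer length ranges stated above, the following Python function sorts a candidate primer into the preferred range (about 20 to about 30 nucleotides) or the broader range (about 15 to about 35 nucleotides). Treating the "about" limits as hard cutoffs, using the letter I for inosine, and the function name `primer_length_class` are all assumptions made for illustration.

```python
def primer_length_class(primer):
    """Classify a primer by length, using the ranges quoted in the text.

    Assumption: 'I' denotes inosine, and the "about" limits are treated
    as inclusive hard cutoffs purely for the purpose of this sketch.
    """
    allowed = set("ACGTI")
    seq = primer.upper()
    if not seq or set(seq) - allowed:
        raise ValueError("primer must contain only A, C, G, T or I (inosine)")
    n = len(seq)
    if 20 <= n <= 30:
        return "preferred"
    if 15 <= n <= 35:
        return "acceptable"
    return "outside range"
```

The check addresses length and alphabet only; it says nothing about uniqueness of the target region, melting temperature, or reaction conditions.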

Antibodies to prostate cancer proteins 

In a preferred embodiment, when the prostate cancer protein is to be used to
generate antibodies, e.g., for immunotherapy or immunodiagnosis, the prostate cancer protein
should share at least one epitope or determinant with the full length protein. By "epitope" or
"determinant" herein is typically meant a portion of a protein which will generate and/or bind
an antibody or T-cell receptor in the context of MHC. Thus, in most instances, antibodies
made to a smaller prostate cancer protein will be able to bind to the full-length protein,
particularly linear epitopes. In a preferred embodiment, the epitope is unique; that is,
antibodies generated to a unique epitope show little or no cross-reactivity.

Methods of preparing polyclonal antibodies are known to the skilled artisan 
(e.g., Coligan, supra; and Harlow & Lane, supra). Polyclonal antibodies can be raised in a 
mammal, e.g., by one or more injections of an immunizing agent and, if desired, an adjuvant.
Typically, the immunizing agent and/or adjuvant will be injected in the mammal by multiple
subcutaneous or intraperitoneal injections. The immunizing agent may include a protein
encoded by a nucleic acid of the figures or fragment thereof or a fusion protein thereof. It
may be useful to conjugate the immunizing agent to a protein known to be immunogenic in
the mammal being immunized. Examples of such immunogenic proteins include but are not
limited to keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, and soybean
trypsin inhibitor. Examples of adjuvants which may be employed include Freund's complete
adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic trehalose
dicorynomycolate). The immunization protocol may be selected by one skilled in the art
without undue experimentation.

The antibodies may, alternatively, be monoclonal antibodies. Monoclonal 
antibodies may be prepared using hybridoma methods, such as those described by Kohler &
Milstein, Nature 256:495 (1975). In a hybridoma method, a mouse, hamster, or other
appropriate host animal, is typically immunized with an immunizing agent to elicit
lymphocytes that produce or are capable of producing antibodies that will specifically bind to
the immunizing agent. Alternatively, the lymphocytes may be immunized in vitro. The
immunizing agent will typically include a polypeptide encoded by a nucleic acid of Tables
1-16, a fragment thereof, or a fusion protein thereof. Generally, either peripheral blood

lymphocytes ("PBLs") are used if cells of human origin are desired, or spleen cells or lymph 
node cells are used if non-human mammalian sources are desired. The lymphocytes are then 
fused with an immortalized cell line using a suitable fusing agent, such as polyethylene 
glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies: Principles and Practice,
pp. 59-103 (1986)). Immortalized cell lines are usually transformed mammalian cells,
particularly myeloma cells of rodent, bovine and human origin. Usually, rat or mouse 
myeloma cell lines are employed. The hybridoma cells may be cultured in a suitable culture 
medium that preferably contains one or more substances that inhibit the growth or survival of 
the unfused, immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium
for the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT 
medium"), which substances prevent the growth of HGPRT-deficient cells. 

In one embodiment, the antibodies are bispecific antibodies. Bispecific 
antibodies are monoclonal, preferably human or humanized, antibodies that have binding
specificities for at least two different antigens or that have binding specificities for two
epitopes on the same antigen. In one embodiment, one of the binding specificities is for a
protein encoded by a nucleic acid of Tables 1-16 or a fragment thereof, the other one is for any
other antigen, and preferably for a cell-surface protein or receptor or receptor subunit, 
preferably one that is tumor specific. Alternatively, tetramer-type technology may create 
multivalent reagents. 

In a preferred embodiment, the antibodies to prostate cancer protein are
capable of reducing or eliminating a biological function of a prostate cancer protein, as is
described below. That is, the addition of anti-prostate cancer protein antibodies (either
polyclonal or preferably monoclonal) to prostate cancer tissue (or cells containing prostate
cancer) may reduce or eliminate the prostate cancer. Generally, at least a 25% decrease in
activity, growth, size or the like is preferred, with at least about 50% being particularly
preferred and about a 95-100% decrease being especially preferred.
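
The percentage thresholds recited above can be illustrated with a short Python sketch. The measurement values, the function names `percent_decrease` and `decrease_tier`, and the use of exact cutoffs in place of the "about" language are assumptions introduced for illustration only.

```python
def percent_decrease(before, after):
    """Percent decrease of a measured quantity (activity, growth, size or the like)."""
    if before <= 0:
        raise ValueError("baseline measurement must be positive")
    return 100.0 * (before - after) / before


def decrease_tier(before, after):
    """Map a percent decrease onto the tiers recited in the text.

    Assumption: the "about" thresholds are applied as exact cutoffs.
    """
    d = percent_decrease(before, after)
    if d >= 95:
        return "especially preferred"
    if d >= 50:
        return "particularly preferred"
    if d >= 25:
        return "preferred"
    return "below threshold"
```

For instance, a hypothetical measurement falling from 100 units to 40 units is a 60% decrease and falls in the "particularly preferred" tier.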

In a preferred embodiment the antibodies to the prostate cancer proteins are
humanized antibodies (e.g., Xenerex Biosciences, Medarex, Inc., Abgenix, Inc., Protein
Design Labs, Inc.). Humanized forms of non-human (e.g., murine) antibodies are chimeric
molecules of immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv,
Fab, Fab', F(ab')2 or other antigen-binding subsequences of antibodies) which contain
minimal sequence derived from non-human immunoglobulin. Humanized antibodies include
human immunoglobulins (recipient antibody) in which residues from a complementarity
determining region (CDR) of the recipient are replaced by residues from a CDR of a
non-human species (donor antibody) such as mouse, rat or rabbit having the desired specificity,
affinity and capacity. In some instances, Fv framework residues of the human
immunoglobulin are replaced by corresponding non-human residues. Humanized antibodies
may also comprise residues which are found neither in the recipient antibody nor in the
imported CDR or framework sequences. In general, a humanized antibody will comprise
substantially all of at least one, and typically two, variable domains, in which all or
substantially all of the CDR regions correspond to those of a non-human immunoglobulin
and all or substantially all of the framework (FR) regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise
at least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., Nature 321:522-525 (1986); Riechmann et al., Nature
332:323-327 (1988); and Presta, Curr. Op. Struct. Biol. 2:593-596 (1992)). Humanization
can be essentially performed following the method of Winter and co-workers (Jones et al.,
Nature 321:522-525 (1986); Riechmann et al., Nature 332:323-327 (1988); Verhoeyen et al.,
Science 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. Accordingly, such humanized antibodies are
chimeric antibodies (U.S. Patent No. 4,816,567), wherein substantially less than an intact
human variable domain has been substituted by the corresponding sequence from a
non-human species.

Human antibodies can also be produced using various techniques known in the
art, including phage display libraries (Hoogenboom & Winter, J. Mol. Biol. 227:381 (1991);
Marks et al., J. Mol. Biol. 222:581 (1991)). The techniques of Cole et al. and Boerner et al.
are also available for the preparation of human monoclonal antibodies (Cole et al.,
Monoclonal Antibodies and Cancer Therapy, p. 77 (1985) and Boerner et al., J. Immunol.
147(1):86-95 (1991)). Similarly, human antibodies can be made by introducing human
immunoglobulin loci into transgenic animals, e.g., mice in which the endogenous
immunoglobulin genes have been partially or completely inactivated. Upon challenge,
human antibody production is observed, which closely resembles that seen in humans in all
respects, including gene rearrangement, assembly, and antibody repertoire. This approach is
described, e.g., in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425;
5,661,016, and in the following scientific publications: Marks et al., Bio/Technology
10:779-783 (1992); Lonberg et al., Nature 368:856-859 (1994); Morrison, Nature 368:812-13
(1994); Fishwild et al., Nature Biotechnology 14:845-51 (1996); Neuberger, Nature
Biotechnology 14:826 (1996); Lonberg & Huszar, Intern. Rev. Immunol. 13:65-93 (1995).

By immunotherapy is meant treatment of prostate cancer with an antibody 
raised against prostate cancer proteins. As used herein, immunotherapy can be passive or
active. Passive immunotherapy as defined herein is the passive transfer of antibody to a
recipient (patient). Active immunization is the induction of antibody and/or T-cell responses
in a recipient (patient). Induction of an immune response is the result of providing the
recipient with an antigen to which antibodies are raised. As appreciated by one of ordinary
skill in the art, the antigen may be provided by injecting a polypeptide against which
antibodies are desired to be raised into a recipient, or contacting the recipient with a nucleic
acid capable of expressing the antigen and under conditions for expression of the antigen,
leading to an immune response.

In a preferred embodiment the prostate cancer proteins against which 
antibodies are raised are secreted proteins as described above. Without being bound by
theory, antibodies used for treatment bind and prevent the secreted protein from binding to
its receptor, thereby inactivating the secreted prostate cancer protein.

In another preferred embodiment, the prostate cancer protein to which
antibodies are raised is a transmembrane protein. Without being bound by theory, antibodies
used for treatment bind the extracellular domain of the prostate cancer protein and prevent it
from binding to other proteins, such as circulating ligands or cell-associated molecules. The
antibody may cause down-regulation of the transmembrane prostate cancer protein. As will
be appreciated by one of ordinary skill in the art, the antibody may be a competitive,
non-competitive or uncompetitive inhibitor of protein binding to the extracellular domain of the
prostate cancer protein. The antibody is also an antagonist of the prostate cancer protein.
Further, the antibody prevents activation of the transmembrane prostate cancer protein. In
one aspect, when the antibody prevents the binding of other molecules to the prostate cancer
protein, the antibody prevents growth of the cell. The antibody may also be used to target or
sensitize the cell to cytotoxic agents, including, but not limited to, TNF-α, TNF-β, IL-1, INF-γ
and IL-2, or chemotherapeutic agents including 5FU, vinblastine, actinomycin D, cisplatin,
methotrexate, and the like. In some instances the antibody belongs to a sub-type that
activates serum complement when complexed with the transmembrane protein thereby
mediating cytotoxicity or antibody-dependent cytotoxicity (ADCC). Thus, prostate cancer is
treated by administering to a patient antibodies directed against the transmembrane prostate
cancer protein. Antibody-labeling may activate a co-toxin, localize a toxin payload, or
otherwise provide means to locally ablate cells.

In another preferred embodiment, the antibody is conjugated to an effector 
moiety. The effector moiety can be any number of molecules, including labelling moieties 
such as radioactive labels or fluorescent labels, or can be a therapeutic moiety. In one aspect 
the therapeutic moiety is a small molecule that modulates the activity of the prostate cancer
protein. In another aspect the therapeutic moiety modulates the activity of molecules
associated with or in close proximity to the prostate cancer protein. The therapeutic moiety
may inhibit enzymatic activity such as protease or collagenase or protein kinase activity
associated with prostate cancer.

In a preferred embodiment, the therapeutic moiety can also be a cytotoxic
agent. In this method, targeting the cytotoxic agent to prostate cancer tissue or cells results
in a reduction in the number of afflicted cells, thereby reducing symptoms associated with
prostate cancer. Cytotoxic agents are numerous and varied and include, but are not limited
to, cytotoxic drugs or toxins or active fragments of such toxins. Suitable toxins and their
corresponding fragments include diphtheria A chain, exotoxin A chain, ricin A chain, abrin A
chain, curcin, crotin, phenomycin, enomycin and the like. Cytotoxic agents also include
radiochemicals made by conjugating radioisotopes to antibodies raised against prostate
cancer proteins, or binding of a radionuclide to a chelating agent that has been covalently
attached to the antibody. Targeting the therapeutic moiety to transmembrane prostate cancer
proteins not only serves to increase the local concentration of therapeutic moiety in the
prostate cancer afflicted area, but also serves to reduce deleterious side effects that may be
associated with the therapeutic moiety.

In another preferred embodiment, the prostate cancer protein against which the
antibodies are raised is an intracellular protein. In this case, the antibody may be conjugated
to a protein which facilitates entry into the cell. In one case, the antibody enters the cell by
endocytosis. In another embodiment, a nucleic acid encoding the antibody is administered to
the individual or cell. Moreover, wherein the prostate cancer protein can be targeted within a
cell, e.g., the nucleus, an antibody thereto contains a signal for that target localization, e.g., a
nuclear localization signal.

The prostate cancer antibodies of the invention specifically bind to prostate
cancer proteins. By "specifically bind" herein is meant that the antibodies bind to the protein
with a Kd of at least about 0.1 mM, more usually at least about 1 µM, preferably at least about
0.1 µM or better, and most preferably, 0.01 µM or better. Selectivity of binding is also
important.
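
The Kd ranges recited above can be sketched in Python as follows. A lower Kd corresponds to tighter binding, so "at least about" and "or better" are read here as "Kd no greater than" the stated value; that reading, the conversion to molar units, and the function name `binding_tier` are assumptions made for illustration.

```python
def binding_tier(kd_molar):
    """Map a dissociation constant (in mol/L) onto the tiers recited in the text.

    Assumption: "at least about X" and "X or better" both mean Kd <= X,
    because a smaller Kd indicates tighter binding.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    if kd_molar <= 1e-8:    # 0.01 uM or better
        return "most preferred"
    if kd_molar <= 1e-7:    # 0.1 uM or better
        return "preferred"
    if kd_molar <= 1e-6:    # 1 uM
        return "more usual"
    if kd_molar <= 1e-4:    # 0.1 mM
        return "specific binding"
    return "not specific"
```

A hypothetical antibody with a Kd of 50 nM (5e-8 M) would fall in the "preferred" tier under this reading.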

Detection of prostate cancer sequence for diagnostic and therapeutic applications 

In one aspect, the RNA expression levels of genes are determined for different
cellular states in the prostate cancer phenotype. Expression levels of genes in normal tissue
(i.e., not undergoing prostate cancer) and in prostate cancer tissue (and in some cases, for
varying severities of prostate cancer that relate to prognosis, as outlined below) are evaluated
to provide expression profiles. An expression profile of a particular cell state or point of
development is essentially a "fingerprint" of the state. While two states may have any
particular gene similarly expressed, the evaluation of a number of genes simultaneously
allows the generation of a gene expression profile that is reflective of the state of the cell. By
comparing expression profiles of cells in different states, information regarding which genes
are important (including both up- and down-regulation of genes) in each of these states is
obtained. Then, diagnosis may be performed or confirmed to determine whether a tissue
sample has the gene expression profile of normal or cancerous tissue. This will provide for
molecular diagnosis of related conditions.

"Differential expression," or grammatical equivalents as used herein, refers to 
qualitative or quantitative differences in the temporal and/or cellular gene expression 
patterns within and among cells and tissue. Thus, a differentially expressed gene can
qualitatively have its expression altered, including an activation or inactivation, in, e.g.,
normal versus prostate cancer tissue. Genes may be turned on or turned off in a particular 
state, relative to another state thus permitting comparison of two or more states. A 
qualitatively regulated gene will exhibit an expression pattern within a state or cell type 
which is detectable by standard techniques. Some genes will be expressed in one state or cell
type, but not in both. Alternatively, the difference in expression may be quantitative, e.g., in
that expression is increased or decreased; i.e., gene expression is either upregulated, resulting 
in an increased amount of transcript, or downregulated, resulting in a decreased amount of 
transcript. The degree to which expression differs need only be large enough to quantify via 
standard characterization techniques as outlined below, such as by use of Affymetrix
GeneChip™ expression arrays, Lockhart, Nature Biotechnology 14:1675-1680 (1996),
hereby expressly incorporated by reference. Other techniques include, but are not limited to,
quantitative reverse transcriptase PCR, northern analysis and RNase protection. As outlined 
above, preferably the change in expression (i.e., upregulation or downregulation) is at least 
about 50%, more preferably at least about 100%, more preferably at least about 150%, more
preferably at least about 200%, with from 300 to at least 1000% being especially preferred.

Evaluation may be at the gene transcript or the protein level. The amount of
gene expression may be monitored using nucleic acid probes to the DNA or RNA equivalent
of the gene transcript, and the quantification of gene expression levels, or, alternatively, the
final gene product itself (protein) can be monitored, e.g., with antibodies to the prostate
cancer protein and standard immunoassays (ELISAs, etc.) or other techniques, including
mass spectroscopy assays, 2D gel electrophoresis assays, etc. Proteins corresponding to 
prostate cancer genes, i.e., those identified as being important in a prostate cancer phenotype, 
can be evaluated in a prostate cancer diagnostic test. 
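
Whether measured at the transcript or the protein level, the percentage changes in expression recited above for differential expression can be computed as in the following Python sketch. Expressing the change relative to a reference state, applying the threshold to the absolute value of the change, and the function names `percent_change` and `is_differentially_expressed` are assumptions made for illustration.

```python
def percent_change(reference, sample):
    """Percent change in expression of the sample relative to a reference state."""
    if reference <= 0:
        raise ValueError("reference expression level must be positive")
    return 100.0 * (sample - reference) / reference


def is_differentially_expressed(reference, sample, threshold=50.0):
    """True if the magnitude of the change meets the threshold (default 50%).

    Assumption: up- and down-regulation are both judged by the absolute
    percent change relative to the reference level.
    """
    return abs(percent_change(reference, sample)) >= threshold
```

Under this convention a decrease can never exceed 100%, so the higher tiers recited in the text (for example 300% or more) can only be met by upregulation unless the change is instead expressed relative to the lower of the two levels.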

In a preferred embodiment, gene expression monitoring is performed
simultaneously on a number of genes. Multiple protein expression monitoring can be
performed as well. Similarly, these assays may be performed on an individual basis as well.

In this embodiment, the prostate cancer nucleic acid probes are attached to 
biochips as outlined herein for the detection and quantification of prostate cancer sequences 
in a particular cell. The assays are further described below in the example. PCR techniques
can be used to provide greater sensitivity.

In a preferred embodiment nucleic acids encoding the prostate cancer protein 
are detected. Although DNA or RNA encoding the prostate cancer protein may be detected, 
of particular interest are methods wherein an mRNA encoding a prostate cancer protein is 
detected. Probes to detect mRNA can be a nucleotide/deoxynucleotide probe that is
complementary to and hybridizes with the mRNA and includes, but is not limited to,
oligonucleotides, cDNA or RNA. Probes also should contain a detectable label, as defined
herein. In one method the mRNA is detected after immobilizing the nucleic acid to be 
examined on a solid support such as nylon membranes and hybridizing the probe with the 
sample. Following washing to remove the non-specifically bound probe, the label is
detected. In another method detection of the mRNA is performed in situ. In this method
permeabilized cells or tissue samples are contacted with a detectably labeled nucleic acid
probe for sufficient time to allow the probe to hybridize with the target mRNA. Following
washing to remove the non-specifically bound probe, the label is detected. For example, a
digoxigenin labeled riboprobe (RNA probe) that is complementary to the mRNA encoding a
prostate cancer protein is detected by binding the digoxigenin with an anti-digoxigenin
secondary antibody and developed with nitro blue tetrazolium and
5-bromo-4-chloro-3-indolyl phosphate.

In a preferred embodiment, various proteins from the three classes of proteins
as described herein (secreted, transmembrane or intracellular proteins) are used in diagnostic
assays. The prostate cancer proteins, antibodies, nucleic acids, modified proteins and cells
containing prostate cancer sequences are used in diagnostic assays. This can be performed on
an individual gene or corresponding polypeptide level. In a preferred embodiment, the
expression profiles are used, preferably in conjunction with high throughput screening
techniques to allow monitoring for expression profile genes and/or corresponding
polypeptides.

As described and defined herein, prostate cancer proteins, including 
intracellular, transmembrane or secreted proteins, find use as markers of prostate cancer. 
Detection of these proteins in putative prostate cancer tissue allows for detection or diagnosis 
of prostate cancer. In one embodiment, antibodies are used to detect prostate cancer proteins.
A preferred method separates proteins from a sample by electrophoresis on a gel (typically a
denaturing and reducing protein gel, but may be another type of gel, including isoelectric 
focusing gels and the like). Following separation of proteins, the prostate cancer protein is 
detected, e.g., by immunoblotting with antibodies raised against the prostate cancer protein. 
Methods of immunoblotting are well known to those of ordinary skill in the art. 

In another preferred method, antibodies to the prostate cancer protein find use
in in situ imaging techniques, e.g., in histology (e.g., Methods in Cell Biology: Antibodies in
Cell Biology, volume 37 (Asai, ed. 1993)). In this method cells are contacted with from one
to many antibodies to the prostate cancer protein(s). Following washing to remove
non-specific antibody binding, the presence of the antibody or antibodies is detected. In one
embodiment the antibody is detected by incubating with a secondary antibody that contains a
detectable label. In another method the primary antibody to the prostate cancer protein(s)
contains a detectable label, e.g. an enzyme marker that can act on a substrate. In another
preferred embodiment each one of multiple primary antibodies contains a distinct and
detectable label. This method finds particular use in simultaneous screening for a plurality of
prostate cancer proteins. As will be appreciated by one of ordinary skill in the art, many
other histological imaging techniques are also provided by the invention.

In a preferred embodiment the label is detected in a fluorometer which has the
ability to detect and distinguish emissions of different wavelengths. In addition, a
fluorescence activated cell sorter (FACS) can be used in the method.

In another preferred embodiment, antibodies find use in diagnosing prostate
cancer from blood, serum, plasma, stool, and other samples. Such samples, therefore, are
useful as samples to be probed or tested for the presence of prostate cancer proteins.
Antibodies can be used to detect a prostate cancer protein by previously described
immunoassay techniques including ELISA, immunoblotting (western blotting),
immunoprecipitation, BIACORE technology and the like. Conversely, the presence of
antibodies may indicate an immune response against an endogenous prostate cancer protein.

In a preferred embodiment, in situ hybridization of labeled prostate cancer 
nucleic acid probes to tissue arrays is done. For example, arrays of tissue samples, including 
prostate cancer tissue and/or normal tissue, are made. In situ hybridization (see, e.g., 
Ausubel, supra) is then performed. When comparing the fingerprints between an individual
and a standard, the skilled artisan can make a diagnosis, a prognosis, or a prediction based on
the findings. It is further understood that the genes which indicate the diagnosis may differ 
from those which indicate the prognosis and molecular profiling of the condition of the cells 
may lead to distinctions between responsive or refractory conditions or may be predictive of 
outcomes. 

In a preferred embodiment, the prostate cancer proteins, antibodies, nucleic
acids, modified proteins and cells containing prostate cancer sequences are used in prognosis
assays. As above, gene expression profiles can be generated that correlate to prostate cancer,
in terms of long term prognosis. Again, this may be done on either a protein or gene level,
with the use of genes being preferred. As above, prostate cancer probes may be attached to
biochips for the detection and quantification of prostate cancer sequences in a tissue or
patient. The assays proceed as outlined above for diagnosis. PCR methods may provide more
sensitive and accurate quantification.

Assays for therapeutic compounds 
In a preferred embodiment members of the proteins, nucleic acids, and
antibodies as described herein are used in drug screening assays. The prostate cancer
proteins, antibodies, nucleic acids, modified proteins and cells containing prostate cancer
sequences are used in drug screening assays or by evaluating the effect of drug candidates on
a "gene expression profile" or expression profile of polypeptides. In a preferred embodiment,
the expression profiles are used, preferably in conjunction with high throughput screening
techniques to allow monitoring for expression profile genes after treatment with a candidate
agent (e.g., Zlokarnik, et al., Science 279:84-8 (1998); Heid, Genome Res. 6:986-94 (1996)).

In a preferred embodiment, the prostate cancer proteins, antibodies, nucleic 
acids, modified proteins and cells containing the native or modified prostate cancer proteins 
are used in screening assays. That is, the present invention provides novel methods for
screening for compositions which modulate the prostate cancer phenotype or an identified
physiological function of a prostate cancer protein. As above, this can be done on an
individual gene level or by evaluating the effect of drug candidates on a "gene expression
profile". In a preferred embodiment, the expression profiles are used, preferably in
conjunction with high throughput screening techniques to allow monitoring for expression
profile genes after treatment with a candidate agent, see Zlokarnik, supra.

Having identified the differentially expressed genes herein, a variety of assays 
may be executed. In a preferred embodiment, assays may be run on an individual gene or 
protein level. That is, having identified a particular gene as up regulated in prostate cancer, 
test compounds can be screened for the ability to modulate gene expression or for binding to
the prostate cancer protein. "Modulation" thus includes both an increase and a decrease in
gene expression. The preferred amount of modulation will depend on the original change of
the gene expression in normal versus tissue undergoing prostate cancer, with changes of at
least 10%, preferably 50%, more preferably 100-300%, and in some embodiments
300-1000% or greater. Thus, if a gene exhibits a 4-fold increase in prostate cancer tissue
compared to normal tissue, a decrease of about four-fold is often desired; similarly, a 10-fold
decrease in prostate cancer tissue compared to normal tissue often provides a target value of a
10-fold increase in expression to be induced by the test compound.
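
The target-value reasoning above can be illustrated with a short calculation. The following is a minimal sketch only; the function names and the use of plain numeric expression levels are assumptions made for illustration and are not part of the disclosed methods.

```python
def restoring_modulation(tumor_level, normal_level):
    """Return (direction, fold) needed to bring tumor expression back to normal."""
    if tumor_level <= 0 or normal_level <= 0:
        raise ValueError("expression levels must be positive")
    if tumor_level >= normal_level:
        # Gene is up-regulated in tumor tissue: the compound should decrease it.
        return ("decrease", tumor_level / normal_level)
    # Gene is down-regulated in tumor tissue: the compound should increase it.
    return ("increase", normal_level / tumor_level)


def percent_change(before, after):
    """Unsigned percent change in expression relative to the starting level."""
    if before <= 0:
        raise ValueError("starting level must be positive")
    return abs(after - before) / before * 100.0


def meets_minimum_modulation(before, after, min_percent=10.0):
    """True when the change reaches the lowest threshold named in the text (10%)."""
    return percent_change(before, after) >= min_percent
```

For a gene measured at 400 units in prostate cancer tissue and 100 units in normal tissue, restoring_modulation(400.0, 100.0) yields a four-fold decrease as the target, matching the example in the text.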

The amount of gene expression may be monitored using nucleic acid probes 
and the quantification of gene expression levels, or, alternatively, the gene product itself can
be monitored, e.g., through the use of antibodies to the prostate cancer protein and standard
immunoassays. Proteomics and separation techniques may also allow quantification of
expression.

In a preferred embodiment, the gene expression or protein levels of a number
of entities, i.e., an expression profile, are monitored simultaneously. Such profiles will
typically involve a plurality of those entities described herein.

In this embodiment, the prostate cancer nucleic acid probes are attached to 
biochips as outlined herein for the detection and quantification of prostate cancer sequences 
in a particular cell. Alternatively, PCR may be used. Thus, a series, e.g., of microtiter plates,
may be used with dispensed primers in desired wells. A PCR reaction can then be performed
and analyzed for each well.
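
The dispensing of primers into desired wells can be pictured as a simple mapping from well positions to primer sets. The sketch below assumes a standard 96-well plate (8 rows by 12 columns) filled row by row; the function name and the primer labels are hypothetical.

```python
import string


def assign_primers_to_wells(primer_sets, rows=8, cols=12):
    """Map primer sets to wells (A1, A2, ... H12), filling the plate row by row."""
    wells = [
        f"{string.ascii_uppercase[r]}{c}"
        for r in range(rows)
        for c in range(1, cols + 1)
    ]
    if len(primer_sets) > len(wells):
        raise ValueError("more primer sets than wells on one plate")
    # zip stops at the shorter input, so unused wells are simply left out.
    return dict(zip(wells, primer_sets))
```

Each entry of the returned dictionary names the well in which one PCR reaction would be run and analyzed.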

Expression monitoring can be performed to identify compounds that modify 
the expression of one or more prostate cancer-associated sequences, e.g., a polynucleotide 
sequence set out in Tables 1-16. Generally, in a preferred embodiment, a test modulator is 
added to the cells prior to analysis. Moreover, screens are also provided to identify agents
that modulate prostate cancer, modulate prostate cancer proteins, bind to a prostate cancer
protein, or interfere with the binding of a prostate cancer protein and an antibody or other 
binding partner. 

The term "test compound" or "drug candidate" or "modulator" or grammatical
equivalents as used herein describes any molecule, e.g., protein, oligopeptide, small organic
molecule, polysaccharide, polynucleotide, etc., to be tested for the capacity to directly or
indirectly alter the prostate cancer phenotype or the expression of a prostate cancer sequence,
e.g., a nucleic acid or protein sequence. In preferred embodiments, modulators alter
expression profiles, or expression profile nucleic acids or proteins provided herein. In one
embodiment, the modulator suppresses a prostate cancer phenotype, e.g., to a normal tissue
fingerprint. In another embodiment, a modulator induces a prostate cancer phenotype.
Generally, a plurality of assay mixtures are run in parallel with different agent concentrations
to obtain a differential response to the various concentrations. Typically, one of these
concentrations serves as a negative control, i.e., at zero concentration or below the level of
detection.
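
A parallel concentration series with a zero-concentration negative control can be sketched as follows. This is an illustrative assumption about how such a series might be laid out (a fixed-ratio serial dilution); the function names, the dilution scheme, and the normalization to the control are not taken from the disclosure.

```python
def dilution_series(top_conc, dilution_factor, n_points):
    """Concentrations from highest to lowest, ending with a zero negative control."""
    if dilution_factor <= 1 or n_points < 1:
        raise ValueError("need dilution_factor > 1 and at least one point")
    series = [top_conc / dilution_factor ** i for i in range(n_points)]
    return series + [0.0]  # final entry is the negative control


def relative_response(signals):
    """Express each signal as a fraction of the negative control (last entry)."""
    control = signals[-1]
    if control == 0:
        raise ValueError("negative control signal must be non-zero")
    return [s / control for s in signals]
```

The signals passed to relative_response are assumed to be listed in the same order as the concentrations returned by dilution_series, so the control value is always the last element.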

Drug candidates encompass numerous chemical classes, though typically they
are organic molecules, preferably small organic compounds having a molecular weight of
more than 100 and less than about 2,500 daltons. Preferred small molecules are less than
2000, or less than 1500, or less than 1000, or less than 500 D. Candidate agents comprise
functional groups necessary for structural interaction with proteins, particularly hydrogen
bonding, and typically include at least an amine, carbonyl, hydroxyl or carboxyl group,
preferably at least two of the functional chemical groups. The candidate agents often
comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic
structures substituted with one or more of the above functional groups. Candidate agents are
also found among biomolecules including peptides, saccharides, fatty acids, steroids, purines,
pyrimidines, derivatives, structural analogs or combinations thereof. Particularly preferred
are peptides.
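
The molecular weight window and functional group preferences above lend themselves to a simple pre-filter over a candidate list. The sketch below is illustrative only; the Candidate record, the compound names, and the filter function are hypothetical, and the default limits simply restate the ranges given in the text.

```python
from dataclasses import dataclass

# Functional groups named in the text as typical for protein interaction.
INTERACTING_GROUPS = frozenset({"amine", "carbonyl", "hydroxyl", "carboxyl"})


@dataclass(frozen=True)
class Candidate:
    name: str
    mol_weight: float          # daltons
    functional_groups: frozenset


def filter_candidates(candidates, min_mw=100.0, max_mw=2500.0, min_groups=1):
    """Keep candidates inside the weight window with enough interacting groups."""
    kept = []
    for c in candidates:
        n_groups = len(c.functional_groups & INTERACTING_GROUPS)
        if min_mw < c.mol_weight < max_mw and n_groups >= min_groups:
            kept.append(c)
    return kept
```

Passing max_mw=500.0 and min_groups=2 would apply the narrowest weight limit and the "at least two" functional group preference described above.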

In one aspect, a modulator will neutralize the effect of a prostate cancer 
protein. By "neutralize" is meant that the activity of a protein, and its consequent
effect on the cell, is inhibited or blocked.

In certain embodiments, combinatorial libraries of potential modulators will be
screened for an ability to bind to a prostate cancer polypeptide or to modulate activity.
Conventionally, new chemical entities with useful properties are generated by identifying a
chemical compound (called a "lead compound") with some desirable property or activity,
e.g., inhibiting activity, creating variants of the lead compound, and evaluating the property
and activity of those variant compounds. Often, high throughput screening (HTS) methods
are employed for such an analysis.

In one preferred embodiment, high throughput screening methods involve 
providing a library containing a large number of potential therapeutic compounds (candidate 
compounds). Such "combinatorial chemical libraries" are then screened in one or more 
assays to identify those library members (particular chemical species or subclasses) that
display a desired characteristic activity. The compounds thus identified can serve as
conventional "lead compounds" or can themselves be used as potential or actual therapeutics.

A combinatorial chemical library is a collection of diverse chemical 
compounds generated by either chemical synthesis or biological synthesis by combining a 
number of chemical "building blocks" such as reagents. For example, a linear combinatorial
chemical library, such as a polypeptide (e.g., mutein) library, is formed by combining a set of
chemical building blocks called amino acids in every possible way for a given compound
length (i.e., the number of amino acids in a polypeptide compound). Millions of chemical
compounds can be synthesized through such combinatorial mixing of chemical building
blocks (Gallop et al., J. Med. Chem. 37(9):1233-1251 (1994)).
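
The size of a linear combinatorial library grows as the number of building blocks raised to the compound length. The following sketch, which assumes the 20 standard amino acids written in one-letter code as the building blocks, shows the count and a lazy enumeration of every member; the function names are illustrative.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter code


def library_size(length, n_blocks=len(AMINO_ACIDS)):
    """Number of distinct linear compounds of the given length."""
    return n_blocks ** length


def enumerate_library(length, blocks=AMINO_ACIDS):
    """Lazily yield every sequence of the given length, one building block per position."""
    for combo in product(blocks, repeat=length):
        yield "".join(combo)
```

Under these assumptions a library of pentapeptides already contains 20^5 = 3,200,000 members, which is consistent with the statement that millions of compounds can be produced by combinatorial mixing.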

Preparation and screening of combinatorial chemical libraries is well known to
those of skill in the art. Such combinatorial chemical libraries include, but are not limited to,
peptide libraries (see, e.g., U.S. Patent No. 5,010,175, Furka, Pept. Prot. Res. 37:487-493
(1991), Houghton et al., Nature, 354:84-88 (1991)), peptoids (PCT Publication No WO
91/19735), encoded peptides (PCT Publication WO 93/20242), random bio-oligomers (PCT
Publication WO 92/00091), benzodiazepines (U.S. Pat. No. 5,288,514), diversomers such as
hydantoins, benzodiazepines and dipeptides (Hobbs et al., Proc. Nat. Acad. Sci. USA
90:6909-6913 (1993)), vinylogous polypeptides (Hagihara et al., J. Amer. Chem. Soc.
114:6568 (1992)), nonpeptidal peptidomimetics with a Beta-D-Glucose scaffolding
(Hirschmann et al., J. Amer. Chem. Soc. 114:9217-9218 (1992)), analogous organic syntheses
of small compound libraries (Chen et al., J. Amer. Chem. Soc. 116:2661 (1994)),
oligocarbamates (Cho, et al., Science 261:1303 (1993)), and/or peptidyl phosphonates
(Campbell et al., J. Org. Chem. 59:658 (1994)). See, generally, Gordon et al., J. Med. Chem.
37:1385 (1994), nucleic acid libraries (see, e.g., Strategene, Corp.), peptide nucleic acid
libraries (see, e.g., U.S. Patent 5,539,083), antibody libraries (see, e.g., Vaughn et al., Nature
Biotechnology 14(3):309-314 (1996), and PCT/US96/10287), carbohydrate libraries (see,
e.g., Liang et al., Science 274:1520-1522 (1996), and U.S. Patent No. 5,593,853), and small
organic molecule libraries (see, e.g., benzodiazepines, Baum, C&EN, Jan 18, page 33 (1993);
isoprenoids, U.S. Patent No. 5,569,588; thiazolidinones and metathiazanones, U.S. Patent No.
5,549,974; pyrrolidines, U.S. Patent Nos. 5,525,735 and 5,519,134; morpholino compounds,
U.S. Patent No. 5,506,337; benzodiazepines, U.S. Patent No. 5,288,514; and the like).

Devices for the preparation of combinatorial libraries are commercially
available (see, e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville KY; Symphony,
Rainin, Woburn, MA; 433A, Applied Biosystems, Foster City, CA; 9050 Plus, Millipore,
Bedford, MA).

A number of well known robotic systems have also been developed for
solution phase chemistries. These systems include automated workstations like the
automated synthesis apparatus developed by Takeda Chemical Industries, LTD. (Osaka,
Japan) and many robotic systems utilizing robotic arms (Zymate II, Zymark Corporation,
Hopkinton, Mass.; Orca, Hewlett-Packard, Palo Alto, Calif.), which mimic the manual
synthetic operations performed by a chemist. Any of the above devices are suitable for use
with the present invention. The nature and implementation of modifications to these devices
(if any) so that they can operate as discussed herein will be apparent to persons skilled in the
relevant art. In addition, numerous combinatorial libraries are themselves commercially
available (see, e.g., ComGenex, Princeton, N.J., Asinex, Moscow, RU, Tripos, Inc., St. Louis,
MO, ChemStar, Ltd, Moscow, RU, 3D Pharmaceuticals, Exton, PA, Martek Biosciences,
Columbia, MD, etc.).

The assays to identify modulators are amenable to high throughput screening.
Preferred assays thus detect enhancement or inhibition of prostate cancer gene transcription,
inhibition or enhancement of polypeptide expression, and inhibition or enhancement of
polypeptide activity.

High throughput assays for the presence, absence, quantification, or other
properties of particular nucleic acids or protein products are well known to those of skill in
the art. Binding assays and reporter gene assays are similarly well known. Thus,
e.g., U.S. Patent No. 5,559,410 discloses high throughput screening methods for proteins,
U.S. Patent No. 5,585,639 discloses high throughput screening methods for nucleic acid
binding (i.e., in arrays), while U.S. Patent Nos. 5,576,220 and 5,541,061 disclose high
throughput methods of screening for ligand/antibody binding.

In addition, high throughput screening systems are commercially available 
(see, e.g., Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman 
Instruments, Inc. Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems 
typically automate entire procedures, including all sample and reagent pipetting, liquid
dispensing, timed incubations, and final readings of the microplate in detector(s) appropriate
for the assay. These configurable systems provide high throughput and rapid start up as well
as a high degree of flexibility and customization. The manufacturers of such systems provide
detailed protocols for various high throughput systems. Thus, e.g., Zymark Corp. provides
technical bulletins describing screening systems for detecting the modulation of gene
transcription, ligand binding, and the like.

In one embodiment, modulators are proteins, often naturally occurring
proteins or fragments of naturally occurring proteins. Thus, e.g., cellular extracts containing 
proteins, or random or directed digests of proteinaceous cellular extracts, may be used. In 
this way libraries of proteins may be made for screening in the methods of the invention.
Particularly preferred in this embodiment are libraries of bacterial, fungal, viral, and
mammalian proteins, with the latter being preferred, and human proteins being especially
preferred. Particularly useful test compounds will be directed to the class of proteins to which
the target belongs, e.g., substrates for enzymes or ligands and receptors.

In a preferred embodiment, modulators are peptides of from about 5 to about
30 amino acids, with from about 5 to about 20 amino acids being preferred, and from about 7
to about 15 being particularly preferred. The peptides may be digests of naturally occurring
proteins as is outlined above, random peptides, or "biased" random peptides. By
"randomized" or grammatical equivalents herein is meant that each nucleic acid and peptide
consists of essentially random nucleotides and amino acids, respectively. Since generally
these random peptides (or nucleic acids, discussed below) are chemically synthesized, they
may incorporate any nucleotide or amino acid at any position. The synthetic process can be
designed to generate randomized proteins or nucleic acids, to allow the formation of all or
most of the possible combinations over the length of the sequence, thus forming a library of
randomized candidate bioactive proteinaceous agents.

In one embodiment, the library is fully randomized, with no sequence
preferences or constants at any position. In a preferred embodiment, the library is biased.
That is, some positions within the sequence are either held constant, or are selected from a
limited number of possibilities. For example, in a preferred embodiment, the nucleotides or
amino acid residues are randomized within a defined class, e.g., of hydrophobic amino acids,
hydrophilic residues, sterically biased (either small or large) residues, towards the creation of
nucleic acid binding domains, the creation of cysteines, for cross-linking, prolines for SH-3
domains, serines, threonines, tyrosines or histidines for phosphorylation sites, etc., or to
purines, etc.
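
The difference between a fully randomized and a biased library can be sketched in a few lines. In the illustration below each position of a biased peptide is described by the residues allowed there: a single residue holds the position constant, a residue class restricts it, and the full alphabet leaves it random. The function names and the particular hydrophobic grouping are assumptions for illustration only.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter code
HYDROPHOBIC = "AVILMFW"               # one common, illustrative grouping


def random_peptide(length, rng):
    """Fully randomized peptide: any residue may appear at any position."""
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def biased_peptide(position_choices, rng):
    """Biased peptide: each position is drawn only from its own allowed residues."""
    return "".join(rng.choice(choices) for choices in position_choices)


def make_library(n_members, position_choices, seed=0):
    """Build a small biased library; the seed makes the draw reproducible."""
    rng = random.Random(seed)
    return [biased_peptide(position_choices, rng) for _ in range(n_members)]
```

For example, the position list ["C", HYDROPHOBIC, AMINO_ACIDS, "P"] holds a cysteine at the first position and a proline at the last, restricts the second position to hydrophobic residues, and leaves the third fully random.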

Modulators of prostate cancer can also be nucleic acids, as defined below. As 
described above generally for proteins, nucleic acid modulating agents may be naturally
occurring nucleic acids, random nucleic acids, or "biased" random nucleic acids. For
example, digests of procaryotic or eucaryotic genomes may be used as is outlined above for
proteins.

In certain embodiments, the activity of a prostate cancer-associated protein is
down-regulated, or entirely inhibited, by the use of an antisense polynucleotide, i.e., a nucleic
acid complementary to, and which can preferably hybridize specifically to, a coding mRNA
nucleic acid sequence, e.g., a prostate cancer protein mRNA, or a subsequence thereof.
Binding of the antisense polynucleotide to the mRNA reduces the translation and/or stability
of the mRNA.

In the context of this invention, antisense polynucleotides can comprise
naturally-occurring nucleotides, or synthetic species formed from naturally-occurring
subunits or their close homologs. Antisense polynucleotides may also have altered sugar
moieties or inter-sugar linkages. Exemplary among these are the phosphorothioate and other
sulfur containing species which are known for use in the art. Analogs are comprehended by
this invention so long as they function effectively to hybridize with the prostate cancer
protein mRNA. See, e.g., Isis Pharmaceuticals, Carlsbad, CA; Sequitor, Inc., Natick, MA.

Such antisense polynucleotides can readily be synthesized using recombinant 
means, or can be synthesized in vitro. Equipment for such synthesis is sold by several 
vendors, including Applied Biosystems. The preparation of other oligonucleotides such as 
phosphorothioates and alkylated derivatives is also well known to those of skill in the art.

Antisense molecules as used herein include antisense or sense
oligonucleotides. Sense oligonucleotides can, e.g., be employed to block transcription by
binding to the anti-sense strand. The antisense and sense oligonucleotides comprise a
single-stranded nucleic acid sequence (either RNA or DNA) capable of binding to target mRNA
(sense) or DNA (antisense) sequences for prostate cancer molecules. Antisense or sense
oligonucleotides, according to the present invention, comprise a fragment generally at least
about 14 nucleotides, preferably from about 14 to 30 nucleotides. The ability to derive an
antisense or a sense oligonucleotide, based upon a cDNA sequence encoding a given protein
is described in, e.g., Stein & Cohen (Cancer Res. 48:2659 (1988)) and van der Krol et al.
(BioTechniques 6:958 (1988)).
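
Deriving an antisense oligonucleotide from a cDNA sequence amounts to taking the reverse complement of a chosen window of the coding (sense) strand. The following minimal sketch assumes a DNA oligonucleotide, a sense-strand cDNA written 5' to 3', and the 14 to 30 nucleotide length range given above; the function name and the example sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(cdna_sense, start, length):
    """Return the DNA antisense oligo (5'->3') for a window of the sense cDNA."""
    if not 14 <= length <= 30:
        raise ValueError("length should be from about 14 to 30 nucleotides")
    target = cdna_sense[start:start + length].upper()
    if len(target) != length:
        raise ValueError("window runs past the end of the sequence")
    if set(target) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G and T")
    # Complement each base, then reverse so the oligo also reads 5'->3'.
    return target.translate(COMPLEMENT)[::-1]
```

With the made-up sequence "ATGGCCAAGCTTGGATCCGA", antisense_oligo(seq, 0, 14) gives "CCAAGCTTGGCCAT", which pairs base for base with the first 14 nucleotides of the sense strand.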

In addition to antisense polynucleotides, ribozymes can be used to target and
inhibit transcription of prostate cancer-associated nucleotide sequences. A ribozyme is an
RNA molecule that catalytically cleaves other RNA molecules. Different kinds of ribozymes
have been described, including group I ribozymes, hammerhead ribozymes, hairpin
ribozymes, RNase P, and axhead ribozymes (see, e.g., Castanotto et al., Adv. in
Pharmacology 25:289-317 (1994) for a general review of the properties of different
ribozymes).

The general features of hairpin ribozymes are described, e.g., in Hampel et al.,
Nucl. Acids Res. 18:299-304 (1990); European Patent Publication No. 0 360 257; U.S. Patent
No. 5,254,678. Methods of preparing are well known to those of skill in the art (see, e.g.,
WO 94/26877; Ojwang et al., Proc. Natl. Acad. Sci. USA 90:6340-6344 (1993); Yamada et
al., Human Gene Therapy 1:39-45 (1994); Leavitt et al., Proc. Natl. Acad. Sci. USA 92:699-703
(1995); Leavitt et al., Human Gene Therapy 5:1151-120 (1994); and Yamada et al.,
Virology 205:121-126 (1994)).

Polynucleotide modulators of prostate cancer may be introduced into a cell
containing the target nucleotide sequence by formation of a conjugate with a ligand binding
molecule, as described in WO 91/04753. Suitable ligand binding molecules include, but are
not limited to, cell surface receptors, growth factors, other cytokines, or other ligands that
bind to cell surface receptors. Preferably, conjugation of the ligand binding molecule does
not substantially interfere with the ability of the ligand binding molecule to bind to its
corresponding molecule or receptor, or block entry of the sense or antisense oligonucleotide
or its conjugated version into the cell. Alternatively, a polynucleotide modulator of prostate
cancer may be introduced into a cell containing the target nucleic acid sequence, e.g., by
formation of a polynucleotide-lipid complex, as described in WO 90/10448. It is
understood that antisense molecules or knock out and knock in models may also be
used in screening assays as discussed above, in addition to methods of treatment.

As noted above, gene expression monitoring is conveniently used to test
candidate modulators (e.g., protein, nucleic acid or small molecule). After the candidate agent
has been added and the cells allowed to incubate for some period of time, the sample
containing a target sequence to be analyzed is added to the biochip. If required, the target
sequence is prepared using known techniques. For example, the sample may be treated to
lyse the cells, using known lysis buffers, electroporation, etc., with purification and/or
amplification such as PCR performed as appropriate. For example, an in vitro transcription
with labels covalently attached to the nucleotides is performed. Generally, the nucleic acids
are labeled with biotin-FITC or PE, or with cy3 or cy5.

In a preferred embodiment, the target sequence is labeled with, e.g., a
fluorescent, a chemiluminescent, a chemical, or a radioactive signal, to provide a means of
detecting the target sequence's specific binding to a probe. The label also can be an enzyme,
such as alkaline phosphatase or horseradish peroxidase, which when provided with an
appropriate substrate produces a product that can be detected. Alternatively, the label can be
a labeled compound or small molecule, such as an enzyme inhibitor, that binds but is not
catalyzed or altered by the enzyme. The label also can be a moiety or compound, such as an
epitope tag or biotin which specifically binds to streptavidin. For the example of biotin, the
streptavidin is labeled as described above, thereby providing a detectable signal for the
bound target sequence. Unbound labeled streptavidin is typically removed prior to analysis.

As will be appreciated by those in the art, these assays can be direct
hybridization assays or can comprise "sandwich assays", which include the use of multiple
probes, as is generally outlined in U.S. Patent Nos. 5,681,702, 5,597,909, 5,545,730,
5,594,117, 5,591,584, 5,571,670, 5,580,731, 5,571,670, 5,591,584, 5,624,802, 5,635,352,
5,594,118, 5,359,100, 5,124,246 and 5,681,697, all of which are hereby incorporated by
reference. In this embodiment, in general, the target nucleic acid is prepared as outlined
above, and then added to the biochip comprising a plurality of nucleic acid probes, under
conditions that allow the formation of a hybridization complex.

A variety of hybridization conditions may be used in the present invention,
including high, moderate and low stringency conditions as outlined above. The assays are
generally run under stringency conditions which allow formation of the label probe
hybridization complex only in the presence of target. Stringency can be controlled by
altering a step parameter that is a thermodynamic variable, including, but not limited to,
temperature, formamide concentration, salt concentration, chaotropic salt concentration, pH,
organic solvent concentration, etc.

These parameters may also be used to control non-specific binding, as is
generally outlined in U.S. Patent No. 5,681,697. Thus it may be desirable to perform certain
steps at higher stringency conditions to reduce non-specific binding.

The reactions outlined herein may be accomplished in a variety of ways.
Components of the reaction may be added simultaneously, or sequentially, in different orders,
with preferred embodiments outlined below. In addition, the reaction may include a variety
of other reagents. These include salts, buffers, neutral proteins, e.g., albumin, detergents, etc.,
which may be used to facilitate optimal hybridization and detection, and/or reduce
non-specific or background interactions. Reagents that otherwise improve the efficiency of the
assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may also be
used as appropriate, depending on the sample preparation methods and purity of the target.

The assay data are analyzed to determine the expression levels, and changes in
expression levels as between states, of individual genes, forming a gene expression profile.

Screens are performed to identify modulators of the prostate cancer
phenotype. In one embodiment, screening is performed to identify modulators that can
induce or suppress a particular expression profile, thus preferably generating the associated
phenotype. In another embodiment, e.g., for diagnostic applications, having identified
differentially expressed genes important in a particular state, screens can be performed to
identify modulators that alter expression of individual genes. In another embodiment,
screening is performed to identify modulators that alter a biological function of the
expression product of a differentially expressed gene. Again, having identified the
importance of a gene in a particular state, screens are performed to identify agents that bind
and/or modulate the biological activity of the gene product.

In addition, screens can be done for genes that are induced in response to a
candidate agent. After identifying a modulator based upon its ability to suppress a prostate
cancer expression pattern leading to a normal expression pattern, or to modulate a single
prostate cancer gene expression profile so as to mimic the expression of the gene from
normal tissue, a screen as described above can be performed to identify genes that are
specifically modulated in response to the agent. Comparing expression profiles between
normal tissue and agent treated prostate cancer tissue reveals genes that are not expressed in
normal tissue or prostate cancer tissue, but are expressed in agent treated tissue. These
agent-specific sequences can be identified and used by methods described herein for prostate cancer
genes or proteins. In particular these sequences and the proteins they encode find use in
marking or identifying agent treated cells. In addition, antibodies can be raised against the
agent induced proteins and used to target novel therapeutics to the treated prostate cancer
tissue sample.
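
The identification of agent-specific sequences described above is in essence a set comparison across three expression profiles. The sketch below is illustrative only; it assumes each profile is a mapping from a gene identifier to a numeric expression level and that a single detection threshold decides whether a gene counts as expressed. The function names, gene labels, and threshold value are hypothetical.

```python
def expressed(profile, threshold=1.0):
    """Set of genes whose level meets the (assumed) detection threshold."""
    return {gene for gene, level in profile.items() if level >= threshold}


def agent_specific_genes(normal, tumor, treated, threshold=1.0):
    """Genes expressed in agent treated tissue but in neither untreated tissue."""
    return (
        expressed(treated, threshold)
        - expressed(normal, threshold)
        - expressed(tumor, threshold)
    )
```

Genes returned by agent_specific_genes correspond to the sequences that, per the text, could serve to mark or identify agent treated cells.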

Thus, in one embodiment, a test compound is administered to a population of
prostate cancer cells that have an associated prostate cancer expression profile. By
"administration" or "contacting" herein is meant that the candidate agent is added to the cells
in such a manner as to allow the agent to act upon the cell, whether by uptake and
intracellular action, or by action at the cell surface. In some embodiments, nucleic acid
encoding a proteinaceous candidate agent (i.e., a peptide) may be put into a viral construct
such as an adenoviral or retroviral construct, and added to the cell, such that expression of
the peptide agent is accomplished, e.g., PCT US97/01019. Regulatable gene therapy systems
can also be used.

Once the test compound has been administered to the cells, the cells can be 
washed if desired and are allowed to incubate under preferably physiological conditions for 
some period of time. The cells are then harvested and a new gene expression profile is
generated, as outlined herein.

Thus, e.g., prostate cancer tissue may be screened for agents that modulate, 
e.g., induce or suppress the prostate cancer phenotype. A change in at least one gene, 
preferably many, of the expression profile indicates that the agent has an effect on prostate 
cancer activity. By defining such a signature for the prostate cancer phenotype, screens for
new drugs that alter the phenotype can be devised. With this approach, the drug target need
not be known and need not be represented in the original expression screening platform, nor 
does the level of transcript for the target protein need to change. 
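
Detecting whether an agent changes at least one gene of the expression profile can be sketched as a comparison of the profile before and after treatment. The following is an assumption-laden illustration: profiles are mappings from gene identifier to expression level, and a two-fold cutoff (not specified in the text) decides what counts as a change.

```python
def changed_genes(before, after, min_fold=2.0):
    """Map each gene changed by at least min_fold (up or down) to its fold change."""
    changed = {}
    for gene in before.keys() & after.keys():
        b, a = before[gene], after[gene]
        if b <= 0 or a <= 0:
            continue  # skip genes without a usable level in both profiles
        fold = a / b
        if fold >= min_fold or fold <= 1.0 / min_fold:
            changed[gene] = fold
    return changed


def agent_has_effect(before, after, min_fold=2.0):
    """True when at least one gene of the profile changes, as the text requires."""
    return len(changed_genes(before, after, min_fold)) >= 1
```

Because the comparison works on the signature as a whole, nothing in it requires the drug target itself to be among the measured genes.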

In a preferred embodiment, as outlined above, screens may be done on
individual genes and gene products (proteins). That is, having identified a particular
differentially expressed gene as important in a particular state, screening of modulators of
either the expression of the gene or the gene product itself can be done. The gene products of
differentially expressed genes are sometimes referred to herein as "prostate cancer proteins"
or a "prostate cancer modulatory protein". The prostate cancer modulatory protein may be a
fragment, or alternatively, be the full length protein to the fragment encoded by the nucleic
acids of Tables 1-16. Preferably, the prostate cancer modulatory protein is a fragment. In a
preferred embodiment, the prostate cancer amino acid sequence which is used to determine
sequence identity or similarity is encoded by a nucleic acid of Tables 1-16. In another
embodiment, the sequences are naturally occurring allelic variants of a protein encoded by a
nucleic acid of Tables 1-16. In another embodiment, the sequences are sequence variants as
further described herein.

Preferably, the prostate cancer modulatory protein is a fragment of
approximately 14 to 24 amino acids long. More preferably the fragment is a soluble
fragment. Preferably, the fragment includes a non-transmembrane region. In a preferred
embodiment, the fragment has an N-terminal Cys to aid in solubility. In one embodiment, the
C-terminus of the fragment is kept as a free acid and the N-terminus is a free amine to aid in
coupling, i.e., to cysteine.

In one embodiment the prostate cancer proteins are conjugated to an 
immunogenic agent as discussed herein. In one embodiment the prostate cancer protein is 
conjugated to BSA. 

Measurements of prostate cancer polypeptide activity, or of prostate cancer or
the prostate cancer phenotype can be performed using a variety of assays. For example, the
effects of the test compounds upon the function of the prostate cancer polypeptides can be
measured by examining parameters described above. A suitable physiological change that
affects activity can be used to assess the influence of a test compound on the polypeptides of
this invention. When the functional consequences are determined using intact cells or
animals, one can also measure a variety of effects such as, in the case of prostate cancer
associated with tumors, tumor growth, tumor metastasis, neovascularization, hormone
release, transcriptional changes to both known and uncharacterized genetic markers (e.g.,
northern blots), changes in cell metabolism such as cell growth or pH changes, and changes
in intracellular second messengers such as cGMP. In the assays of the invention, mammalian
prostate cancer polypeptide is typically used, e.g., mouse, preferably human.

Assays to identify compounds with modulating activity can be performed in
vitro. For example, a prostate cancer polypeptide is first contacted with a potential modulator
and incubated for a suitable amount of time, e.g., from 0.5 to 48 hours. In one embodiment,
the prostate cancer polypeptide levels are determined in vitro by measuring the level of
protein or mRNA. The level of protein is measured using immunoassays such as western
blotting, ELISA and the like with an antibody that selectively binds to the prostate cancer
polypeptide or a fragment thereof. For measurement of mRNA, amplification, e.g., using
PCR, LCR, or hybridization assays, e.g., northern hybridization, RNAse protection, dot
blotting, are preferred. The level of protein or mRNA is detected using directly or indirectly
labeled detection agents, e.g., fluorescently or radioactively labeled nucleic acids,
radioactively or enzymatically labeled antibodies, and the like, as described herein.

Alternatively, a reporter gene system can be devised using the prostate cancer
protein promoter operably linked to a reporter gene such as luciferase, green fluorescent
protein, CAT, or β-gal. The reporter construct is typically transfected into a cell. After
treatment with a potential modulator, the amount of reporter gene transcription, translation, or
activity is measured according to standard techniques known to those of skill in the art.
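
As an illustration of how reporter activity after treatment might be summarized, the following minimal Python sketch computes the ratio of mean reporter signal in treated wells to that in untreated wells. The function name, the use of replicate wells, and the example readings are assumptions made for illustration; they are not taken from the specification.

```python
from statistics import mean


def reporter_fold_change(treated_signal, untreated_signal):
    """Ratio of mean reporter signal (e.g., luciferase light units) in
    modulator-treated wells to mean signal in untreated wells."""
    baseline = mean(untreated_signal)
    if baseline <= 0:
        raise ValueError("untreated signal must be positive")
    return mean(treated_signal) / baseline


# Hypothetical replicate readings, for illustration only.
treated = [1200.0, 1100.0, 1300.0]
untreated = [2400.0, 2300.0, 2500.0]

fold = reporter_fold_change(treated, untreated)
print(f"fold change = {fold:.2f}")
```

A ratio below 1 would suggest that the candidate modulator reduces reporter expression, and a ratio above 1 that it increases it.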

In a preferred embodiment, as outlined above, screens may be done on 
individual genes and gene products (proteins). That is, having identified a particular 
differentially expressed gene as important in a particular state, screening of modulators of the 
expression of the gene or the gene product itself can be done. The gene products of
differentially expressed genes are sometimes referred to herein as "prostate cancer proteins."
The prostate cancer protein may be a fragment, or alternatively, be the full length protein to a 
fragment shown herein. 

In one embodiment, screening for modulators of expression of specific genes
is performed. Typically, the expression of only one or a few genes is evaluated. In another
embodiment, screens are designed to first find compounds that bind to differentially
expressed proteins. These compounds are then evaluated for the ability to modulate
differentially expressed activity. Moreover, once initial candidate compounds are identified,
variants can be further screened to better evaluate structure activity relationships.

In a preferred embodiment, binding assays are done. In general, purified or
isolated gene product is used; that is, the gene products of one or more differentially
expressed nucleic acids are made. For example, antibodies are generated to the protein gene
products, and standard immunoassays are run to determine the amount of protein present.
Alternatively, cells comprising the prostate cancer proteins can be used in the assays.

Thus, in a preferred embodiment, the methods comprise combining a prostate
cancer protein and a candidate compound, and determining the binding of the compound to
the prostate cancer protein. Preferred embodiments utilize the human prostate cancer protein,
although other mammalian proteins may also be used, e.g. for the development of animal
models of human disease. In some embodiments, as outlined herein, variant or derivative
prostate cancer proteins may be used.

Generally, in a preferred embodiment of the methods herein, the prostate
cancer protein or the candidate agent is non-diffusably bound to an insoluble support having
isolated sample receiving areas (e.g. a microtiter plate, an array, etc.). The insoluble
supports may be made of any composition to which the compositions can be bound, that is readily
separated from soluble material, and that is otherwise compatible with the overall method of
screening. The surface of such supports may be solid or porous and of any convenient shape.
Examples of suitable insoluble supports include microtiter plates, arrays, membranes and
beads. These are typically made of glass, plastic (e.g., polystyrene), polysaccharides, nylon
or nitrocellulose, Teflon™, etc. Microtiter plates and arrays are especially convenient because
a large number of assays can be carried out simultaneously, using small amounts of reagents
and samples. The particular manner of binding of the composition is not crucial so long as it
is compatible with the reagents and overall methods of the invention, maintains the activity of
the composition and is nondiffusable. Preferred methods of binding include the use of
antibodies (which do not sterically block either the ligand binding site or activation sequence
when the protein is bound to the support), direct binding to "sticky" or ionic supports,
chemical crosslinking, the synthesis of the protein or agent on the surface, etc. Following
binding of the protein or agent, excess unbound material is removed by washing. The sample
receiving areas may then be blocked through incubation with bovine serum albumin (BSA),
casein or other innocuous protein or other moiety.

In a preferred embodiment, the prostate cancer protein is bound to the support,
and a test compound is added to the assay. Alternatively, the candidate agent is bound to the
support and the prostate cancer protein is added. Novel binding agents include specific
antibodies, non-natural binding agents identified in screens of chemical libraries, peptide
analogs, etc. Of particular interest are screening assays for agents that have a low toxicity for
human cells. A wide variety of assays may be used for this purpose, including labeled in
vitro protein-protein binding assays, electrophoretic mobility shift assays, immunoassays for
protein binding, functional assays (phosphorylation assays, etc.) and the like.

The determination of the binding of the test modulating compound to the
prostate cancer protein may be done in a number of ways. In a preferred embodiment, the
compound is labeled, and binding determined directly, e.g., by attaching all or a portion of
the prostate cancer protein to a solid support, adding a labeled candidate agent (e.g., a
fluorescent label), washing off excess reagent, and determining whether the label is present
on the solid support. Various blocking and washing steps may be utilized as appropriate.

In some embodiments, only one of the components is labeled, e.g., the
proteins (or proteinaceous candidate compounds) can be labeled. Alternatively, more than
one component can be labeled with different labels, e.g., ¹²⁵I for the proteins and a fluorophor
for the compound. Proximity reagents, e.g., quenching or energy transfer reagents, are also
useful.

In one embodiment, the binding of the test compound is determined by
competitive binding assay. The competitor is a binding moiety known to bind to the target
molecule (i.e., a prostate cancer protein), such as an antibody, peptide, binding partner,
ligand, etc. Under certain circumstances, there may be competitive binding between the
compound and the binding moiety, with the binding moiety displacing the compound. In one
embodiment, the test compound is labeled. Either the compound, or the competitor, or both,
is added first to the protein for a time sufficient to allow binding, if present. Incubations may
be performed at a temperature which facilitates optimal activity, typically between 4 and
40°C. Incubation periods are typically optimized, e.g., to facilitate rapid high throughput
screening. Typically between 0.1 and 1 hour will be sufficient. Excess reagent is generally
removed or washed away. The second component is then added, and the presence or absence
of the labeled component is followed, to indicate binding.

In a preferred embodiment, the competitor is added first, followed by the test
compound. Displacement of the competitor is an indication that the test compound is binding
to the prostate cancer protein and thus is capable of binding to, and potentially modulating,
the activity of the prostate cancer protein. In this embodiment, either component can be
labeled. Thus, e.g., if the competitor is labeled, the presence of label in the wash solution
indicates displacement by the agent. Alternatively, if the test compound is labeled, the
presence of the label on the support indicates displacement.

In an alternative embodiment, the test compound is added first, with
incubation and washing, followed by the competitor. The absence of binding by the
competitor may indicate that the test compound is bound to the prostate cancer protein with a
higher affinity. Thus, if the test compound is labeled, the presence of the label on the
support, coupled with a lack of competitor binding, may indicate that the test compound is
capable of binding to the prostate cancer protein.

In a preferred embodiment, the methods comprise differential screening to
identify agents that are capable of modulating the activity of the prostate cancer proteins. In
this embodiment, the methods comprise combining a prostate cancer protein and a competitor
in a first sample. A second sample comprises a test compound, a prostate cancer protein, and
a competitor. The binding of the competitor is determined for both samples, and a change, or
difference in binding between the two samples indicates the presence of an agent capable of
binding to the prostate cancer protein and potentially modulating its activity. That is, if the
binding of the competitor is different in the second sample relative to the first sample, the
agent is capable of binding to the prostate cancer protein.

Alternatively, differential screening is used to identify drug candidates that
bind to the native prostate cancer protein, but cannot bind to modified prostate cancer
proteins. The structure of the prostate cancer protein may be modeled, and used in rational
drug design to synthesize agents that interact with that site. Drug candidates that affect the
activity of a prostate cancer protein are also identified by screening drugs for the ability to
either enhance or reduce the activity of the protein.

Positive controls and negative controls may be used in the assays. Preferably
control and test samples are performed in at least triplicate to obtain statistically significant
results. Incubation of all samples is for a time sufficient for the binding of the agent to the
protein. Following incubation, samples are washed free of non-specifically bound material
and the amount of bound, generally labeled agent is determined. For example, where a
radiolabel is employed, the samples may be counted in a scintillation counter to determine the
amount of bound compound.
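
By way of illustration only, the comparison of triplicate control and test samples in a competitive binding assay can be sketched in Python as follows. The sketch assumes that a labeled competitor is counted on the support and that a lower count in the test sample reflects displacement by the test compound. The function name, the optional background correction, and the counts are hypothetical and are not taken from the specification.

```python
from statistics import mean


def percent_displacement(control_cpm, test_cpm, background_cpm=0.0):
    """Percent reduction in bound labeled competitor in the test sample
    relative to the control sample, from replicate scintillation counts."""
    control = mean(control_cpm) - background_cpm
    test = mean(test_cpm) - background_cpm
    if control <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (control - test) / control


# Hypothetical triplicate counts per minute, for illustration only.
control_counts = [10000, 9800, 10200]  # protein + labeled competitor
test_counts = [4000, 4100, 3900]       # protein + labeled competitor + test compound

displacement = percent_displacement(control_counts, test_counts)
print(f"displacement = {displacement:.1f}%")
```

Whether a given displacement is meaningful would still depend on the replicate variability and on the positive and negative controls described above.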

A variety of other reagents may be included in the screening assays. These
include reagents like salts, neutral proteins, e.g. albumin, detergents, etc. which may be used
to facilitate optimal protein-protein binding and/or reduce non-specific or background
interactions. Also reagents that otherwise improve the efficiency of the assay, such as
protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture
of components may be added in an order that provides for the requisite binding.

In a preferred embodiment, the invention provides methods for screening for a
compound capable of modulating the activity of a prostate cancer protein. The methods
comprise adding a test compound, as defined above, to a cell comprising prostate cancer
proteins. Preferred cell types include almost any cell. The cells contain a recombinant
nucleic acid that encodes a prostate cancer protein. In a preferred embodiment, a library of
candidate agents is tested on a plurality of cells.

In one aspect, the assays are evaluated in the presence or absence or previous
or subsequent exposure to physiological signals, e.g. hormones, antibodies, peptides,
antigens, cytokines, growth factors, action potentials, pharmacological agents including
chemotherapeutics, radiation, carcinogenics, or other cells (i.e. cell-cell contacts). In another
example, the determinations are made at different stages of the cell cycle process.

In this way, compounds that modulate prostate cancer agents are identified.
Compounds with pharmacological activity are able to enhance or interfere with the activity of
the prostate cancer protein. Once identified, similar structures are evaluated to identify
critical structural features of the compound.

In one embodiment, a method of inhibiting prostate cancer cell division is
provided. The method comprises administration of a prostate cancer inhibitor. In another
embodiment, a method of inhibiting prostate cancer is provided. The method comprises
administration of a prostate cancer inhibitor. In a further embodiment, methods of treating
cells or individuals with prostate cancer are provided. The method comprises administration
of a prostate cancer inhibitor.

In one embodiment, a prostate cancer inhibitor is an antibody as discussed
above. In another embodiment, the prostate cancer inhibitor is an antisense molecule.

A variety of cell growth, proliferation, and metastasis assays are known to 
those of skill in the art, as described below. 

Soft agar growth or colony formation in suspension 

Normal cells require a solid substrate to attach and grow. When the cells are
transformed, they lose this phenotype and grow detached from the substrate. For example,
transformed cells can grow in stirred suspension culture or suspended in semi-solid media,
such as semi-solid or soft agar. The transformed cells, when transfected with tumor
suppressor genes, regenerate normal phenotype and require a solid substrate to attach and
grow. Soft agar growth or colony formation in suspension assays can be used to identify
modulators of prostate cancer sequences, which when expressed in host cells, inhibit
abnormal cellular proliferation and transformation. A therapeutic compound would reduce or
eliminate the host cells' ability to grow in stirred suspension culture or suspended in semi-solid
media, such as semi-solid or soft agar.

Techniques for soft agar growth or colony formation in suspension assays are
described in Freshney, Culture of Animal Cells: a Manual of Basic Technique (3rd ed., 1994),
herein incorporated by reference. See also, the methods section of Garkavtsev et al. (1996),
supra, herein incorporated by reference.

Contact inhibition and density limitation of growth 

Normal cells typically grow in a flat and organized pattern in a petri dish until
they touch other cells. When the cells touch one another, they are contact inhibited and stop
growing. When cells are transformed, however, the cells are not contact inhibited and
continue to grow to high densities in disorganized foci. Thus, the transformed cells grow to a
higher saturation density than normal cells. This can be detected morphologically by the
formation of a disoriented monolayer of cells or rounded cells in foci within the regular
pattern of normal surrounding cells. Alternatively, labeling index with (³H)-thymidine at
saturation density can be used to measure density limitation of growth. See Freshney (1994),
supra. The transformed cells, when transfected with tumor suppressor genes, regenerate a
normal phenotype and become contact inhibited and would grow to a lower density.

In this assay, labeling index with (³H)-thymidine at saturation density is a
preferred method of measuring density limitation of growth. Transformed host cells are
transfected with a prostate cancer-associated sequence and are grown for 24 hours at
saturation density in non-limiting medium conditions. The percentage of cells labeling with
(³H)-thymidine is determined autoradiographically. See, Freshney (1994), supra.
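
The labeling index described above is a simple percentage, which can be sketched in Python as follows. The function name and the cell counts are hypothetical; the sketch assumes that labeled and total cells have already been counted from the autoradiograph.

```python
def labeling_index(labeled_cells, total_cells):
    """Percentage of counted cells that incorporated the thymidine label."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= labeled_cells <= total_cells:
        raise ValueError("labeled_cells must lie between 0 and total_cells")
    return 100.0 * labeled_cells / total_cells


# Hypothetical counts from one autoradiograph field, for illustration only.
index = labeling_index(labeled_cells=45, total_cells=500)
print(f"labeling index = {index:.1f}%")
```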

Growth factor or serum dependence

Transformed cells have a lower serum dependence than their normal
counterparts (see, e.g., Temin, J. Natl. Cancer Inst. 37:167-175 (1966); Eagle et al., J. Exp.
Med. 131:836-879 (1970); Freshney, supra). This is in part due to release of various growth
factors by the transformed cells. Growth factor or serum dependence of transformed host
cells can be compared with that of control.

Tumor specific marker levels

Tumor cells release a greater amount of certain factors (hereinafter "tumor
specific markers") than their normal counterparts. For example, plasminogen activator (PA)
is released from human glioma at a higher level than from normal brain cells (see, e.g.,
Gullino, Angiogenesis, tumor vascularization, and potential interference with tumor growth,
in Biological Responses in Cancer, pp. 178-184 (Mihich (ed.) 1985)). Similarly, tumor
angiogenesis factor (TAF) is released at a higher level in tumor cells than their normal
counterparts. See, e.g., Folkman, Angiogenesis and Cancer, Sem Cancer Biol. (1992).

Various techniques which measure the release of these factors are described in
Freshney (1994), supra. Also, see, Unkless et al., J. Biol. Chem. 249:4295-4305 (1974);
Strickland & Beers, J. Biol. Chem. 251:5694-5702 (1976); Whur et al., Br. J. Cancer 42:305-312
(1980); Gullino, Angiogenesis, tumor vascularization, and potential interference with
tumor growth, in Biological Responses in Cancer, pp. 178-184 (Mihich (ed.) 1985);
Freshney, Anticancer Res. 5:111-130 (1985).

Invasiveness into Matrigel 

The degree of invasiveness into Matrigel or some other extracellular matrix
constituent can be used as an assay to identify compounds that modulate prostate cancer-associated
sequences. Tumor cells exhibit a good correlation between malignancy and
invasiveness of cells into Matrigel or some other extracellular matrix constituent. In this
assay, tumorigenic cells are typically used as host cells. Expression of a tumor suppressor
gene in these host cells would decrease invasiveness of the host cells.

Techniques described in Freshney (1994), supra, can be used. Briefly, the
level of invasion of host cells can be measured by using filters coated with Matrigel or some
other extracellular matrix constituent. Penetration into the gel, or through to the distal side of
the filter, is rated as invasiveness, and rated histologically by number of cells and distance
moved, or by prelabeling the cells with ¹²⁵I and counting the radioactivity on the distal side of
the filter or bottom of the dish. See, e.g., Freshney (1984), supra.

Tumor growth in vivo 

Effects of prostate cancer-associated sequences on cell growth can be tested in
transgenic or immune-suppressed mice. Knock-out transgenic mice can be made, in which
the prostate cancer gene is disrupted or in which a prostate cancer gene is inserted. Knock-out
transgenic mice can be made by insertion of a marker gene or other heterologous gene
into the endogenous prostate cancer gene site in the mouse genome via homologous
recombination. Such mice can also be made by substituting the endogenous prostate cancer
gene with a mutated version of the prostate cancer gene, or by mutating the endogenous
prostate cancer gene, e.g., by exposure to carcinogens.

A DNA construct is introduced into the nuclei of embryonic stem cells. Cells
containing the newly engineered genetic lesion are injected into a host mouse embryo, which
is re-implanted into a recipient female. Some of these embryos develop into chimeric mice
that possess germ cells partially derived from the mutant cell line. Therefore, by breeding the
chimeric mice it is possible to obtain a new line of mice containing the introduced genetic
lesion (see, e.g., Capecchi et al., Science 244:1288 (1989)). Chimeric targeted mice can be
derived according to Hogan et al., Manipulating the Mouse Embryo: A Laboratory Manual,
Cold Spring Harbor Laboratory (1988) and Teratocarcinomas and Embryonic Stem Cells: A
Practical Approach, Robertson, ed., IRL Press, Washington, D.C., (1987).

Alternatively, various immune-suppressed or immune-deficient host animals
can be used. For example, a genetically athymic "nude" mouse (see, e.g., Giovanella et al., J.
Natl. Cancer Inst. 52:921 (1974)), a SCID mouse, a thymectomized mouse, or an irradiated
mouse (see, e.g., Bradley et al., Br. J. Cancer 38:263 (1978); Selby et al., Br. J. Cancer
41:52 (1980)) can be used as a host. Transplantable tumor cells (typically about 10⁶ cells)
injected into isogenic hosts will produce invasive tumors in a high proportion of cases, while
normal cells of similar origin will not. In hosts which developed invasive tumors, cells
expressing a prostate cancer-associated sequence are injected subcutaneously. After a
suitable length of time, preferably 4-8 weeks, tumor growth is measured (e.g., by volume or
by its two largest dimensions) and compared to the control. Tumors that have statistically
significant reduction (using, e.g., Student's t-test) are said to have inhibited growth.
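
The comparison of tumor growth against control can be illustrated with the following Python sketch, which is offered under stated assumptions rather than as the method of the specification. It estimates tumor volume from the two largest dimensions using the commonly used approximation V = L × W² / 2 (an assumption; the text does not give a formula) and computes a pooled-variance two-sample Student's t statistic. The function names and all measurements are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def tumor_volume(length_mm, width_mm):
    """Approximate volume (mm^3) from the two largest dimensions.

    Uses V = length * width**2 / 2, an assumed convention not stated in the text.
    """
    return length_mm * width_mm ** 2 / 2.0


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for a two-sample Student's t-test
    with pooled (equal) variance."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two measurements")
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return t, df


# Hypothetical (length, width) measurements in mm, for illustration only.
control_dims = [(10, 8), (12, 8), (11, 9)]
treated_dims = [(8, 6), (9, 6), (7, 5)]

control_volumes = [tumor_volume(l, w) for l, w in control_dims]
treated_volumes = [tumor_volume(l, w) for l, w in treated_dims]

t_stat, df = pooled_t_statistic(treated_volumes, control_volumes)
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

The resulting t value would be compared against the critical value for the given degrees of freedom at the chosen significance level, e.g., from a standard t table. A negative t here indicates smaller mean volume in the treated group.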

Methods of identifying variant prostate cancer-associated sequences

Without being bound by theory, expression of various prostate cancer
sequences is correlated with prostate cancer. Accordingly, disorders based on mutant or
variant prostate cancer genes may be determined. In one embodiment, the invention provides
methods for identifying cells containing variant prostate cancer genes, e.g., determining all or
part of the sequence of at least one endogenous prostate cancer gene in a cell. This may be
accomplished using any number of sequencing techniques. In a preferred embodiment, the
invention provides methods of identifying the prostate cancer genotype of an individual, e.g.,
determining all or part of the sequence of at least one prostate cancer gene of the individual.
This is generally done in at least one tissue of the individual, and may include the evaluation
of a number of tissues or different samples of the same tissue. The method may include
comparing the sequence of the sequenced prostate cancer gene to a known prostate cancer
gene, i.e., a wild-type gene.

The sequence of all or part of the prostate cancer gene can then be compared
to the sequence of a known prostate cancer gene to determine if any differences exist. This
can be done using any number of known homology programs, such as Bestfit, etc. In a
preferred embodiment, the presence of a difference in the sequence between the prostate
cancer gene of the patient and the known prostate cancer gene correlates with a disease state
or a propensity for a disease state, as outlined herein.
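
As a simplified illustration of the comparison step, the following Python sketch lists the positions at which a sample sequence differs from a reference sequence. It is not Bestfit or any other homology program. It assumes the two sequences have already been aligned to equal length, and it reports substitutions only. The function name and the sequences are hypothetical.

```python
def sequence_differences(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch
    between two pre-aligned, equal-length sequences (1-based positions)."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        (position, ref_base, obs_base)
        for position, (ref_base, obs_base) in enumerate(
            zip(reference.upper(), sample.upper()), start=1
        )
        if ref_base != obs_base
    ]


# Hypothetical sequences, for illustration only.
wild_type = "ATGGCCTTA"
patient = "ATGGCTTTA"

differences = sequence_differences(wild_type, patient)
print(differences)
```

Insertions and deletions would require a true alignment step, which this sketch deliberately leaves out.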

In a preferred embodiment, the prostate cancer genes are used as probes to
determine the number of copies of the prostate cancer gene in the genome.

In another preferred embodiment, the prostate cancer genes are used as probes
to determine the chromosomal localization of the prostate cancer genes. Information such as
chromosomal localization finds use in providing a diagnosis or prognosis, in particular when
chromosomal abnormalities such as translocations and the like are identified in the prostate
cancer gene locus.

Administration of pharmaceutical and vaccine compositions

In one embodiment, a therapeutically effective dose of a prostate cancer
protein or modulator thereof is administered to a patient. By "therapeutically effective dose"
herein is meant a dose that produces effects for which it is administered. The exact dose will
depend on the purpose of the treatment, and will be ascertainable by one skilled in the art
using known techniques (e.g., Ansel et al., Pharmaceutical Dosage Forms and Drug
Delivery; Lieberman, Pharmaceutical Dosage Forms (vols. 1-3, 1992), Dekker, ISBN
0824770846, 082476918X, 0824712692, 0824716981; Lloyd, The Art, Science and
Technology of Pharmaceutical Compounding (1999); and Pickar, Dosage Calculations
(1999)). As is known in the art, adjustments for prostate cancer degradation, systemic versus
localized delivery, and rate of new protease synthesis, as well as the age, body weight,
general health, sex, diet, time of administration, drug interaction and the severity of the
condition may be necessary, and will be ascertainable with routine experimentation by those
skilled in the art. U.S. Patent Application No. 09/687,576, which further discloses the use of
compositions and methods of diagnosis and treatment in prostate cancer, is hereby expressly
incorporated by reference.

A "patient" for the purposes of the present invention includes both humans
and other animals, particularly mammals. Thus the methods are applicable to both human
therapy and veterinary applications. In the preferred embodiment the patient is a mammal,
preferably a primate, and in the most preferred embodiment the patient is human.

The administration of the prostate cancer proteins and modulators thereof of
the present invention can be done in a variety of ways as discussed above, including, but not
limited to, orally, subcutaneously, intravenously, intranasally, transdermally,
intraperitoneally, intramuscularly, intrapulmonarily, vaginally, rectally, or intraocularly. In
some instances, e.g., in the treatment of wounds and inflammation, the prostate cancer
proteins and modulators may be directly applied as a solution or spray.

The pharmaceutical compositions of the present invention comprise a prostate
cancer protein in a form suitable for administration to a patient. In the preferred embodiment,
the pharmaceutical compositions are in a water soluble form, such as being present as
pharmaceutically acceptable salts, which is meant to include both acid and base addition
salts. "Pharmaceutically acceptable acid addition salt" refers to those salts that retain the
biological effectiveness of the free bases and that are not biologically or otherwise
undesirable, formed with inorganic acids such as hydrochloric acid, hydrobromic acid,
sulfuric acid, nitric acid, phosphoric acid and the like, and organic acids such as acetic acid,
propionic acid, glycolic acid, pyruvic acid, oxalic acid, maleic acid, malonic acid, succinic
acid, fumaric acid, tartaric acid, citric acid, benzoic acid, cinnamic acid, mandelic acid,
methanesulfonic acid, ethanesulfonic acid, p-toluenesulfonic acid, salicylic acid and the like.
"Pharmaceutically acceptable base addition salts" include those derived from inorganic bases
such as sodium, potassium, lithium, ammonium, calcium, magnesium, iron, zinc, copper,
manganese, aluminum salts and the like. Particularly preferred are the ammonium,
potassium, sodium, calcium, and magnesium salts. Salts derived from pharmaceutically
acceptable organic non-toxic bases include salts of primary, secondary, and tertiary amines,
substituted amines including naturally occurring substituted amines, cyclic amines and basic
ion exchange resins, such as isopropylamine, trimethylamine, diethylamine, triethylamine,
tripropylamine, and ethanolamine.

The pharmaceutical compositions may also include one or more of the
following: carrier proteins such as serum albumin; buffers; fillers such as microcrystalline
cellulose, lactose, corn and other starches; binding agents; sweeteners and other flavoring
agents; coloring agents; and polyethylene glycol.

The pharmaceutical compositions can be administered in a variety of unit
dosage forms depending upon the method of administration. For example, unit dosage forms
suitable for oral administration include, but are not limited to, powder, tablets, pills, capsules
and lozenges. It is recognized that prostate cancer protein modulators (e.g., antibodies,
antisense constructs, ribozymes, small organic molecules, etc.) when administered orally,
should be protected from digestion. This is typically accomplished either by complexing the
molecule(s) with a composition to render it resistant to acidic and enzymatic hydrolysis, or by
packaging the molecule(s) in an appropriately resistant carrier, such as a liposome or a
protection barrier. Means of protecting agents from digestion are well known in the art.

The compositions for administration will commonly comprise a prostate
cancer protein modulator dissolved in a pharmaceutically acceptable carrier, preferably an
aqueous carrier. A variety of aqueous carriers can be used, e.g., buffered saline and the like.
These solutions are sterile and generally free of undesirable matter. These compositions may
be sterilized by conventional, well known sterilization techniques. The compositions may
contain pharmaceutically acceptable auxiliary substances as required to approximate
physiological conditions such as pH adjusting and buffering agents, toxicity adjusting agents
and the like, e.g., sodium acetate, sodium chloride, potassium chloride, calcium chloride,
sodium lactate and the like. The concentration of active agent in these formulations can vary
widely, and will be selected primarily based on fluid volumes, viscosities, body weight and
the like in accordance with the particular mode of administration selected and the patient's
needs (e.g., Remington's Pharmaceutical Science (15th ed., 1980) and Goodman & Gilman,
The Pharmacological Basis of Therapeutics (Hardman et al., eds., 1996)).

Thus, a typical pharmaceutical composition for intravenous administration
would be about 0.1 to 10 mg per patient per day. Dosages from 0.1 up to about 100 mg per
patient per day may be used, particularly when the drug is administered to a secluded site and
not into the blood stream, such as into a body cavity or into a lumen of an organ.
Substantially higher dosages are possible in topical administration. Actual methods for
preparing parenterally administrable compositions will be known or apparent to those skilled
in the art, e.g., Remington's Pharmaceutical Science and Goodman and Gilman, The
Pharmacological Basis of Therapeutics, supra.

The compositions containing modulators of prostate cancer proteins can be
administered for therapeutic or prophylactic treatments. In therapeutic applications,
compositions are administered to a patient suffering from a disease (e.g., a cancer) in an
amount sufficient to cure or at least partially arrest the disease and its complications. An
amount adequate to accomplish this is defined as a "therapeutically effective dose." Amounts
effective for this use will depend upon the severity of the disease and the general state of the
patient's health. Single or multiple administrations of the compositions may be administered
depending on the dosage and frequency as required and tolerated by the patient. In any event,
the composition should provide a sufficient quantity of the agents of this invention to
effectively treat the patient. An amount of modulator that is capable of preventing or slowing
the development of cancer in a mammal is referred to as a "prophylactically effective dose."
The particular dose required for a prophylactic treatment will depend upon the medical
condition and history of the mammal, the particular cancer being prevented, as well as other
factors such as age, weight, gender, administration route, efficiency, etc. Such prophylactic
treatments may be used, e.g., in a mammal who has previously had cancer to prevent a
recurrence of the cancer, or in a mammal who is suspected of having a significant likelihood
of developing cancer.

It will be appreciated that the present prostate cancer protein-modulating
compounds can be administered alone or in combination with additional prostate cancer
modulating compounds or with other therapeutic agents, e.g., other anti-cancer agents or
treatments.

In numerous embodiments, one or more nucleic acids, e.g., polynucleotides
comprising nucleic acid sequences set forth in Tables 1-16, such as antisense polynucleotides
or ribozymes, will be introduced into cells, in vitro or in vivo. The present invention provides
methods, reagents, vectors, and cells useful for expression of prostate cancer-associated
polypeptides and nucleic acids using in vitro (cell-free), ex vivo or in vivo (cell or
organism-based) recombinant expression systems.

The particular procedure used to introduce the nucleic acids into a host cell for
expression of a protein or nucleic acid is application specific. Many procedures for
introducing foreign nucleotide sequences into host cells may be used. These include the use
of calcium phosphate transfection, spheroplasts, electroporation, liposomes, microinjection,
plasmid vectors, viral vectors and any of the other well known methods for introducing cloned
genomic DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see,
e.g., Berger & Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology
volume 152 (Berger); Ausubel et al., eds., Current Protocols (supplemented through 1999);
and Sambrook et al., Molecular Cloning - A Laboratory Manual (2nd ed., Vol. 1-3, 1989)).

In a preferred embodiment, prostate cancer proteins and modulators are
administered as therapeutic agents, and can be formulated as outlined above. Similarly,
prostate cancer genes (including the full-length sequence, partial sequences, or
regulatory sequences of the prostate cancer coding regions) can be administered in a gene
therapy application. These prostate cancer genes can include antisense applications, either as
gene therapy (i.e., for incorporation into the genome) or as antisense compositions, as will be
appreciated by those in the art.

Prostate cancer polypeptides and polynucleotides can also be administered as
vaccine compositions to stimulate HTL, CTL and antibody responses. Such vaccine
compositions can include, e.g., lipidated peptides (see, e.g., Vitiello, A. et al., J. Clin. Invest.
95:341 (1995)), peptide compositions encapsulated in poly(DL-lactide-co-glycolide) ("PLG")
microspheres (see, e.g., Eldridge, et al., Molec. Immunol. 28:287-294 (1991); Alonso et al.,
Vaccine 12:299-306 (1994); Jones et al., Vaccine 13:675-681 (1995)), peptide compositions
contained in immune stimulating complexes (ISCOMS) (see, e.g., Takahashi et al., Nature
344:873-875 (1990); Hu et al., Clin Exp Immunol. 113:235-243 (1998)), multiple antigen
peptide systems (MAPs) (see, e.g., Tam, Proc. Natl. Acad. Sci. U.S.A. 85:5409-5413 (1988);
Tam, J. Immunol. Methods 196:17-32 (1996)), peptides formulated as multivalent peptides;
peptides for use in ballistic delivery systems, typically crystallized peptides, viral delivery
vectors (Perkus, et al., In: Concepts in vaccine development (Kaufmann, ed., p. 379, 1996);
Chakrabarti, et al., Nature 320:535 (1986); Hu et al., Nature 320:537 (1986); Kieny, et al.,
AIDS Bio/Technology 4:790 (1986); Top et al., J. Infect. Dis. 124:148 (1971); Chanda et al.,
Virology 175:535 (1990)), particles of viral or synthetic origin (see, e.g., Kofler et al., J.
Immunol. Methods 192:25 (1996); Eldridge et al., Sem. Hematol. 30:16 (1993); Falo et al.,
Nature Med. 7:649 (1995)), adjuvants (Warren et al., Annu. Rev. Immunol. 4:369 (1986);
Gupta et al., Vaccine 11:293 (1993)), liposomes (Reddy et al., J. Immunol. 148:1585 (1992);
Rock, Immunol. Today 17:131 (1996)), or naked or particle absorbed cDNA (Ulmer, et al.,
Science 259:1745 (1993); Robinson et al., Vaccine 11:957 (1993); Shiver et al., In: Concepts
in vaccine development (Kaufmann, ed., p. 423, 1996); Cease & Berzofsky, Annu. Rev.
Immunol. 12:923 (1994) and Eldridge et al., Sem. Hematol. 30:16 (1993)). Toxin-targeted
delivery technologies, also known as receptor mediated targeting, such as those of Avant
Immunotherapeutics, Inc. (Needham, Massachusetts) may also be used.

Vaccine compositions often include adjuvants. Many adjuvants contain a
substance designed to protect the antigen from rapid catabolism, such as aluminum hydroxide
or mineral oil, and a stimulator of immune responses, such as lipid A, Bordetella pertussis or
Mycobacterium tuberculosis derived proteins. Certain adjuvants are commercially available
as, e.g., Freund's Incomplete Adjuvant and Complete Adjuvant (Difco Laboratories, Detroit,
MI); Merck Adjuvant 65 (Merck and Company, Inc., Rahway, NJ); AS-2 (SmithKline
Beecham, Philadelphia, PA); aluminum salts such as aluminum hydroxide gel (alum) or
aluminum phosphate; salts of calcium, iron or zinc; an insoluble suspension of acylated
tyrosine; acylated sugars; cationically or anionically derivatized polysaccharides;
polyphosphazenes; biodegradable microspheres; monophosphoryl lipid A and quil A.
Cytokines, such as GM-CSF, interleukin-2, -7, -12, and other like growth factors, may also be
used as adjuvants.

Vaccines can be administered as nucleic acid compositions wherein DNA or
RNA encoding one or more of the polypeptides, or a fragment thereof, is administered to a
patient. This approach is described, for instance, in Wolff et al., Science 247:1465 (1990) as
well as U.S. Patent Nos. 5,580,859; 5,589,466; 5,804,566; 5,739,118; 5,736,524; 5,679,647;
WO 98/04720; and in more detail below. Examples of DNA-based delivery technologies
include "naked DNA", facilitated (bupivacaine, polymers, peptide-mediated) delivery,
cationic lipid complexes, and particle-mediated ("gene gun") or pressure-mediated delivery
(see, e.g., U.S. Patent No. 5,922,687).

For therapeutic or prophylactic immunization purposes, the peptides of the
invention can be expressed by viral or bacterial vectors. Examples of expression vectors
include attenuated viral hosts, such as vaccinia or fowlpox. This approach involves the use of
vaccinia virus, e.g., as a vector to express nucleotide sequences that encode prostate cancer
polypeptides or polypeptide fragments. Upon introduction into a host, the recombinant
vaccinia virus expresses the immunogenic peptide, and thereby elicits an immune response.
Vaccinia vectors and methods useful in immunization protocols are described in, e.g., U.S.
Patent No. 4,722,848. Another vector is BCG (Bacille Calmette Guerin). BCG vectors are
described in Stover et al., Nature 351:456-460 (1991). A wide variety of other vectors useful
for therapeutic administration or immunization, e.g., adeno and adeno-associated virus vectors,
retroviral vectors, Salmonella typhi vectors, detoxified anthrax toxin vectors, and the like,
will be apparent to those skilled in the art from the description herein (see, e.g., Shata et al.,
Mol Med Today 6:66-71 (2000); Shedlock et al., J Leukoc Biol 68:793-806 (2000); Hipp et
al., In Vivo 14:571-85 (2000)).

Methods for the use of genes as DNA vaccines are well known, and include
placing a prostate cancer gene or portion of a prostate cancer gene under the control of a
regulatable promoter or a tissue-specific promoter for expression in a prostate cancer patient.
The prostate cancer gene used for DNA vaccines can encode full-length prostate cancer
proteins, but more preferably encodes portions of the prostate cancer proteins including
peptides derived from the prostate cancer protein. In one embodiment, a patient is
immunized with a DNA vaccine comprising a plurality of nucleotide sequences derived from
a prostate cancer gene. For example, prostate cancer-associated genes or sequences encoding
subfragments of a prostate cancer protein are introduced into expression vectors and tested
for their immunogenicity in the context of Class I MHC and an ability to generate cytotoxic T
cell responses. This procedure provides for production of cytotoxic T cell responses against
cells which present antigen, including intracellular epitopes.

In a preferred embodiment, the DNA vaccines include a gene encoding an
adjuvant molecule with the DNA vaccine. Such adjuvant molecules include cytokines that
increase the immunogenic response to the prostate cancer polypeptide encoded by the DNA
vaccine. Additional or alternative adjuvants are available.

In another preferred embodiment, prostate cancer genes find use in generating
animal models of prostate cancer. When the prostate cancer gene identified is repressed or
diminished in cancer tissue, gene therapy technology, e.g., wherein antisense RNA is directed
to the prostate cancer gene, will also diminish or repress expression of the gene. Animal
models of prostate cancer find use in screening for modulators of a prostate cancer-associated
sequence or modulators of prostate cancer. Similarly, transgenic animal technology
including gene knockout technology, e.g., as a result of homologous recombination with an
appropriate gene targeting vector, will result in the absence or increased expression of the
prostate cancer protein. When desired, tissue-specific expression or knockout of the prostate
cancer protein may be necessary.

It is also possible that the prostate cancer protein is overexpressed in prostate
cancer. As such, transgenic animals can be generated that overexpress the prostate cancer
protein. Depending on the desired expression level, promoters of various strengths can be
employed to express the transgene. Also, the number of copies of the integrated transgene
can be determined and compared for a determination of the expression level of the transgene.
Animals generated by such methods find use as animal models of prostate cancer and are
additionally useful in screening for modulators to treat prostate cancer.

Kits for Use in Diagnostic and/or Prognostic Applications 

For use in diagnostic, research, and therapeutic applications suggested above,
kits are also provided by the invention. In the diagnostic and research applications such kits
may include any or all of the following: assay reagents, buffers, prostate cancer-specific
nucleic acids or antibodies, hybridization probes and/or primers, antisense polynucleotides,
ribozymes, dominant negative prostate cancer polypeptides or polynucleotides, small
molecule inhibitors of prostate cancer-associated sequences, etc. A therapeutic product may
include sterile saline or another pharmaceutically acceptable emulsion and suspension base.

In addition, the kits may include instructional materials containing directions
(i.e., protocols) for the practice of the methods of this invention. While the instructional
materials typically comprise written or printed materials, they are not limited to such. Any
medium capable of storing such instructions and communicating them to an end user is
contemplated by this invention. Such media include, but are not limited to, electronic storage
media (e.g., magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the
like. Such media may include addresses to internet sites that provide such instructional
materials.

The present invention also provides for kits for screening for modulators of
prostate cancer-associated sequences. Such kits can be prepared from readily available
materials and reagents. For example, such kits can comprise one or more of the following
materials: a prostate cancer-associated polypeptide or polynucleotide, reaction tubes, and
instructions for testing prostate cancer-associated activity. Optionally, the kit contains
biologically active prostate cancer protein. A wide variety of kits and components can be
prepared according to the present invention, depending upon the intended user of the kit and
the particular needs of the user. Diagnosis would typically involve evaluation of a plurality
of genes or products. The genes will be selected based on correlations with important
parameters in disease which may be identified in historical or outcome data.

EXAMPLES

Example 1: Tissue Preparation, Labeling Chips, and Fingerprints 

Purifying total RNA from tissue sample using TRIzol Reagent

The sample weight is first estimated. The tissue samples are homogenized in
1 ml of TRIzol per 50 mg of tissue using a homogenizer (e.g., Polytron 3100). The size of
the generator/probe used depends upon the sample amount. A generator that is too large for
the amount of tissue to be homogenized will cause a loss of sample and lower RNA yield. A
larger generator (e.g., 20 mm) is suitable for tissue samples weighing more than 0.6 g.
Tubes should not be overfilled. If the working volume is greater than 2 ml and no greater than
10 ml, a 15 ml polypropylene tube (Falcon 2059) is suitable for homogenization.

Tissues should be kept frozen until homogenized. The TRIzol is added
directly to the frozen tissue before homogenization. Following homogenization, the insoluble
material is removed from the homogenate by centrifugation at 7500 x g for 15 min. in a
Sorvall superspeed or 12,000 x g for 10 min. in an Eppendorf centrifuge at 4°C. The cleared
homogenate is then transferred to a new tube(s). Samples may be frozen and stored at -60 to
-70°C for at least one month, or the purification may be continued.

The next process is phase separation. The homogenized samples are incubated
for 5 minutes at room temperature. Then, 0.2 ml of chloroform per 1 ml of TRIzol reagent is
added to the homogenization mixture. The tubes are securely capped and shaken vigorously
by hand (do not vortex) for 15 seconds. The samples are then incubated at room temp. for
2-3 minutes and next centrifuged at 6500 rpm in a Sorvall superspeed for 30 min. at 4°C.

The next process is RNA precipitation. The aqueous phase is transferred to a
fresh tube. The organic phase can be saved if isolation of DNA or protein is desired. Then
0.5 ml of isopropyl alcohol is added per 1 ml of TRIzol reagent used in the original
homogenization. Then, the tubes are securely capped and inverted to mix. The samples are
then incubated at room temp. for 10 minutes and centrifuged at 6500 rpm in a Sorvall for 20
min. at 4°C.

The RNA is then washed. The supernatant is poured off and the pellet washed
with cold 75% ethanol. 1 ml of 75% ethanol is used per 1 ml of the TRIzol reagent used in
the initial homogenization. The tubes are capped securely and inverted several times to
loosen the pellet without vortexing. They are next centrifuged at <8000 rpm (<7500 x g) for 5
minutes at 4°C.
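The reagent volumes in the extraction steps above all scale with the volume of TRIzol, which
in turn scales with tissue mass. The following is a minimal Python sketch of that arithmetic.
The function name `trizol_volumes` is hypothetical and is not part of the protocol; the ratios
are simply those stated in the text (1 ml TRIzol per 50 mg tissue; 0.2 ml chloroform, 0.5 ml
isopropyl alcohol and 1 ml of 75% ethanol per 1 ml TRIzol).

```python
def trizol_volumes(tissue_mg: float) -> dict:
    """Return reagent volumes (ml) for a tissue sample of the given mass (mg).

    Illustrative helper only; the ratios are taken from the protocol text.
    """
    if tissue_mg <= 0:
        raise ValueError("tissue mass must be positive")
    trizol_ml = tissue_mg / 50.0  # 1 ml TRIzol per 50 mg tissue
    return {
        "trizol_ml": trizol_ml,
        "chloroform_ml": 0.2 * trizol_ml,   # phase separation
        "isopropanol_ml": 0.5 * trizol_ml,  # RNA precipitation
        "ethanol_75_ml": 1.0 * trizol_ml,   # RNA wash
    }


if __name__ == "__main__":
    for name, volume in trizol_volumes(200).items():
        print(f"{name}: {volume:.2f}")
```

For a 200 mg sample this corresponds to 4 ml TRIzol, 0.8 ml chloroform, 2 ml isopropyl alcohol
and 4 ml of 75% ethanol.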

The RNA wash is decanted. The pellet is carefully transferred to an
Eppendorf tube (sliding down the tube into the new tube by use of a pipet tip to help guide it
in if necessary). Tube sizes for precipitating the RNA depend on the working volumes.
Larger tubes may take too long to dry. The pellet is dried. The RNA is then resuspended in an
appropriate volume (e.g., 2-5 ug/ul) of DEPC H2O. The absorbance is then measured.

The poly A+ mRNA may next be purified from total RNA by other methods
such as Qiagen's RNeasy kit. The poly A+ mRNA is purified from total RNA by adding the
Oligotex suspension, which has been heated to 37°C and mixed prior to adding to the RNA.
The Elution Buffer is incubated at 70°C. If there is precipitate in the buffer, warm up the 2 x
Binding Buffer at 65°C. The total RNA is mixed with DEPC-treated water, 2 x Binding
Buffer, and Oligotex according to Table 2 on page 16 of the Oligotex Handbook and next
incubated for 3 minutes at 65°C and 10 minutes at room temperature.

The preparation is centrifuged for 2 minutes at 14,000 to 18,000 g, preferably
at a "soft setting." The supernatant is removed without disturbing the Oligotex pellet. A little bit
of solution can be left behind to reduce the loss of Oligotex. The supernatant is saved until
satisfactory binding and elution of poly A+ mRNA has been found.

Then, the preparation is gently resuspended in Wash Buffer OW2 and pipetted
onto the spin column and centrifuged at full speed (soft setting if possible) for 1 minute.

Next, the spin column is transferred to a new collection tube and gently
resuspended in Wash Buffer OW2 and centrifuged as described herein.

Then, the spin column is transferred to a new tube and eluted with 20 to 100 ul
of preheated (70°C) Elution Buffer. The Oligotex resin is gently resuspended by pipetting up
and down. The centrifugation is repeated as above and the elution repeated with fresh elution
buffer or first eluate to keep the elution volume low.

The absorbance is next read to determine the yield, using diluted Elution
Buffer as the blank.

Before proceeding with cDNA synthesis, the mRNA is precipitated, as components
left over from the Oligotex purification procedure or in the Elution Buffer
will inhibit downstream enzymatic reactions of the mRNA.
0.4 vol. of 7.5 M NH4OAc + 2.5 vol. of cold 100% ethanol is added and the preparation
precipitated at -20°C for 1 hour to overnight (or 20-30 min. at -70°C), and centrifuged at
14,000-16,000 x g for 30 minutes at 4°C. Next, the pellet is washed with 0.5 ml of 80%
ethanol (-20°C) and then centrifuged at 14,000-16,000 x g for 5 minutes at room temperature.
The 80% ethanol wash is then repeated. The last bit of ethanol from the pellet is then dried
without use of a speed vacuum and the pellet is then resuspended in DEPC H2O at 1 ug/ul
concentration.
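The precipitation volumes above are stated as multiples of the sample volume. The short
Python sketch below works them through for a given sample. It assumes, since the text does
not say otherwise, that both the 0.4 vol. of 7.5 M NH4OAc and the 2.5 vol. of ethanol are
measured relative to the starting sample volume; the function name `ethanol_precipitation`
is hypothetical.

```python
def ethanol_precipitation(sample_ul: float,
                          salt_vol: float = 0.4,
                          ethanol_vol: float = 2.5,
                          salt_stock_molar: float = 7.5) -> dict:
    """Volumes (ul) for NH4OAc/ethanol precipitation of an RNA sample.

    Assumption: both multiples refer to the starting sample volume.
    """
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    salt_ul = salt_vol * sample_ul
    ethanol_ul = ethanol_vol * sample_ul
    total_ul = sample_ul + salt_ul + ethanol_ul
    return {
        "nh4oac_ul": salt_ul,
        "ethanol_ul": ethanol_ul,
        "total_ul": total_ul,
        # Final NH4OAc molarity in the complete precipitation mixture.
        "final_nh4oac_molar": salt_stock_molar * salt_ul / total_ul,
    }


if __name__ == "__main__":
    print(ethanol_precipitation(100.0))
```

Under that assumption a 100 ul sample takes 40 ul of 7.5 M NH4OAc and 250 ul of ethanol, for a
total of 390 ul and a final NH4OAc concentration of roughly 0.77 M.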

Alternatively, the RNA may be purified using other methods (e.g., Qiagen's RNeasy kit).

No more than 100 ug is added to the RNeasy column. The sample volume is
adjusted to 100 ul with RNase-free water. 350 ul Buffer RLT and then 250 ul ethanol
(100%) are added to the sample. The preparation is then mixed by pipetting and applied to an
RNeasy mini spin column for centrifugation (15 sec at >10,000 rpm). If yield is low, reapply
the flowthrough to the column and centrifuge again.

Then, the column is transferred to a new 2 ml collection tube, 500 ul Buffer
RPE is added, and the column is centrifuged for 15 sec at >10,000 rpm. The flowthrough is
discarded. 500 ul Buffer RPE is then added again and the preparation is centrifuged for 15 sec
at >10,000 rpm. The flowthrough is discarded, and the column membrane dried by centrifuging
for 2 min at maximum speed. The column is transferred to a new 1.5-ml collection tube.
30-50 ul of RNase-free water is applied directly onto the column membrane. The column is then
centrifuged for 1 min at >10,000 rpm and the elution step repeated.

The absorbance is then read to determine yield. If necessary, the material may 
be ethanol precipitated with ammonium acetate and 2.5X volume 100% ethanol. 

First Strand cDNA Synthesis

The first strand can be made using Gibco's "Superscript Choice System
for cDNA Synthesis" kit. The starting material is 5 ug of total RNA or 1 ug of polyA+
mRNA. For total RNA, 2 ul of Superscript RT is used; for polyA+ mRNA, 1 ul of
Superscript RT is used. The final volume of first strand synthesis mix is 20 ul. The RNA
should be in a volume no greater than 10 ul. The RNA is incubated with 1 ul of 100 pmol
T7-T24 oligo for 10 min at 70°C followed by addition on ice of 7 ul of: 4 ul 5X 1st Strand
Buffer, 2 ul of 0.1 M DTT, and 1 ul of 10 mM dNTP mix. The preparation is then incubated at
37°C for 2 min before addition of the Superscript RT followed by incubation at 37°C for 1
hour.

Second Strand Synthesis

For the second strand synthesis, place 1st strand reactions on ice and add: 91
ul DEPC H2O; 30 ul 5X 2nd Strand Buffer; 3 ul 10 mM dNTP mix; 1 ul 10 U/ul E. coli DNA
Ligase; 4 ul 10 U/ul E. coli DNA Polymerase; and 1 ul 2 U/ul RNase H. Mix and incubate 2
hours at 16°C. Add 2 ul T4 DNA Polymerase. Incubate 5 min at 16°C. Add 10 ul of 0.5 M
EDTA.
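As a consistency check on the second strand recipe, the listed additions can be summed
together with the 20 ul first strand reaction. The Python sketch below is illustrative only;
the names `FIRST_STRAND_UL`, `ADDITIONS_UL` and `second_strand_summary` are hypothetical, and
the volumes and unit concentrations are those given in the text.

```python
# Volumes in ul, as listed for second strand synthesis.
FIRST_STRAND_UL = 20.0
ADDITIONS_UL = {
    "DEPC H2O": 91.0,
    "5X 2nd Strand Buffer": 30.0,
    "10 mM dNTP mix": 3.0,
    "E. coli DNA Ligase (10 U/ul)": 1.0,
    "E. coli DNA Polymerase (10 U/ul)": 4.0,
    "RNase H (2 U/ul)": 1.0,
}
T4_POLYMERASE_UL = 2.0
EDTA_UL = 10.0


def second_strand_summary() -> dict:
    """Total reaction volume, buffer strength and enzyme units in the reaction."""
    total_ul = FIRST_STRAND_UL + sum(ADDITIONS_UL.values())
    return {
        "reaction_ul": total_ul,
        # A 5X buffer diluted into the full reaction volume.
        "buffer_fold": 5.0 * ADDITIONS_UL["5X 2nd Strand Buffer"] / total_ul,
        "ligase_units": 10.0 * ADDITIONS_UL["E. coli DNA Ligase (10 U/ul)"],
        "polymerase_units": 10.0 * ADDITIONS_UL["E. coli DNA Polymerase (10 U/ul)"],
        "rnase_h_units": 2.0 * ADDITIONS_UL["RNase H (2 U/ul)"],
        # Volume after the T4 DNA Polymerase and EDTA additions.
        "stopped_ul": total_ul + T4_POLYMERASE_UL + EDTA_UL,
    }


if __name__ == "__main__":
    print(second_strand_summary())
```

The additions bring the reaction to 150 ul, at which volume the 5X buffer is at 1X; after the
T4 DNA Polymerase and EDTA additions the volume is 162 ul.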

Cleaning up cDNA

The cDNA is purified using Phenol:Chloroform:Isoamyl Alcohol (25:24:1)
and Phase-Lock gel tubes. The PLG tubes are centrifuged for 30 sec at maximum speed.
The cDNA mix is then transferred to a PLG tube. An equal volume of
phenol:chloroform:isoamyl alcohol is then added, the preparation shaken vigorously (no
vortexing), and centrifuged for 5 minutes at maximum speed. The top aqueous solution is
transferred to a new tube and ethanol precipitated by adding 7.5X 5M NH4OAc and 2.5X
volume of 100% ethanol. Next, it is centrifuged immediately at room temperature for 20
min, maximum speed. The supernatant is removed, and the pellet washed 2X with cold
80% ethanol. As much ethanol wash as possible should be removed before air drying the
pellet and resuspending it in 3 ul RNase-free water.

In vitro Transcription (IVT) and labeling with biotin

In vitro Transcription (IVT) and labeling with biotin is performed as follows:
Pipet 1.5 ul of cDNA into a thin-wall PCR tube. Make NTP labeling mix by combining 2 ul
T7 10x ATP (75 mM) (Ambion); 2 ul T7 10x GTP (75 mM) (Ambion); 1.5 ul T7 10x CTP (75
mM) (Ambion); 1.5 ul T7 10x UTP (75 mM) (Ambion); 3.75 ul 10 mM Bio-11-UTP
(Boehringer-Mannheim/Roche or Enzo); 3.75 ul 10 mM Bio-16-CTP (Enzo); 2 ul 10x T7
transcription buffer (Ambion); and 2 ul 10x T7 enzyme mix (Ambion). The final volume is
20 ul. Incubate 6 hours at 37°C in a PCR machine. The RNA can be further cleaned.
Clean-up follows the previous instructions for RNeasy columns or Qiagen's RNeasy protocol
handbook. The cRNA often needs to be ethanol precipitated and then resuspended in a volume
compatible with the fragmentation step.
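The IVT labeling mix can be checked arithmetically: the listed volumes should sum to the
stated 20 ul, and each nucleotide's final concentration follows from its stock concentration
and volume. The Python sketch below does this; the names `IVT_COMPONENTS` and
`ivt_final_concentrations` are hypothetical, and the volumes and stock concentrations are
taken from the text.

```python
# (component, volume in ul, stock concentration in mM or None if not applicable)
IVT_COMPONENTS = [
    ("cDNA", 1.5, None),
    ("ATP", 2.0, 75.0),
    ("GTP", 2.0, 75.0),
    ("CTP", 1.5, 75.0),
    ("UTP", 1.5, 75.0),
    ("Bio-11-UTP", 3.75, 10.0),
    ("Bio-16-CTP", 3.75, 10.0),
    ("10x T7 transcription buffer", 2.0, None),
    ("10x T7 enzyme mix", 2.0, None),
]


def ivt_final_concentrations(components=IVT_COMPONENTS):
    """Return (total volume in ul, {nucleotide: final concentration in mM})."""
    total_ul = sum(volume for _, volume, _ in components)
    final_mm = {
        name: stock * volume / total_ul
        for name, volume, stock in components
        if stock is not None
    }
    return total_ul, final_mm


if __name__ == "__main__":
    total, final = ivt_final_concentrations()
    print(f"total volume: {total} ul")
    for name, conc in final.items():
        print(f"{name}: {conc:.3f} mM")
    labeled_ctp = final["Bio-16-CTP"] / (final["CTP"] + final["Bio-16-CTP"])
    print(f"biotinylated fraction of CTP: {labeled_ctp:.2f}")
```

With these values the mix totals 20 ul, ATP and GTP are each at 7.5 mM, and unlabeled plus
biotinylated CTP (and likewise UTP) also total 7.5 mM, of which one quarter is biotinylated.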

Fragmentation is performed as follows. 15 ug of labeled RNA is usually
fragmented. Try to minimize the fragmentation reaction volume; a 10 ul volume is
recommended but 20 ul is all right. Do not go higher than 20 ul because the magnesium in
the fragmentation buffer contributes to precipitation in the hybridization buffer. Fragment
RNA by incubation at 94°C for 35 minutes in 1 x Fragmentation buffer (5 x Fragmentation
buffer is 200 mM Tris-acetate, pH 8.1; 500 mM KOAc; 150 mM MgOAc). The labeled
RNA transcript can be analyzed before and after fragmentation. Samples can be heated to
65°C for 15 minutes and electrophoresed on 1% agarose/TBE gels to get an approximate idea
of the transcript size range.

For hybridization, 200 ul (10 ug cRNA) of a hybridization mix is put on the
chip. If multiple hybridizations are to be done (such as cycling through a 5 chip set), then it
is recommended that an initial hybridization mix of 300 ul or more be made. The
hybridization mix is: fragmented labeled RNA (50 ng/ul final conc.); 50 pM 948-b control
oligo; 1.5 pM BioB; 5 pM BioC; 25 pM BioD; 100 pM CRE; 0.1 mg/ml herring sperm DNA;
0.5 mg/ml acetylated BSA; and brought to 300 ul with 1x MES hyb buffer.
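The quantities in the hybridization mix follow directly from the stated final concentrations
and the mix volume. The Python sketch below is a minimal illustration of that arithmetic; the
function name `hyb_mix_amounts` is hypothetical, and only the components given as mass
concentrations in the text are included.

```python
def hyb_mix_amounts(mix_volume_ul: float) -> dict:
    """Mass (ug) of each mass-defined component in a hybridization mix."""
    if mix_volume_ul <= 0:
        raise ValueError("mix volume must be positive")
    return {
        # 50 ng/ul final concentration, converted from ng to ug.
        "fragmented_cRNA_ug": 50.0 * mix_volume_ul / 1000.0,
        # 0.1 mg/ml is the same as 0.1 ug/ul.
        "herring_sperm_DNA_ug": 0.1 * mix_volume_ul,
        # 0.5 mg/ml is the same as 0.5 ug/ul.
        "acetylated_BSA_ug": 0.5 * mix_volume_ul,
    }


if __name__ == "__main__":
    print(hyb_mix_amounts(300.0))  # full mix
    print(hyb_mix_amounts(200.0))  # portion loaded on the chip
```

A 300 ul mix therefore contains 15 ug of fragmented cRNA, which matches the 15 ug usually
fragmented, and the 200 ul loaded on the chip carries 10 ug, as stated above.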

The hybridization reaction is conducted with non-biotinylated IVT (purified
by RNeasy columns) (see Example 1 for steps from tissue to IVT). The following mixture is
prepared:

IVT antisense RNA, 4 ug: ul
Random Hexamers (1 ug/ul): 4 ul
H2O: ul
Total: 14 ul

Incubate the above 14 ul mixture at 70°C for 10 min.; then put on ice.

The reverse transcription procedure uses the following mixture:

0.1 M DTT: 3 ul
50X dNTP mix: 0.6 ul
H2O: 2.4 ul
Cy3 or Cy5 dUTP (1 mM): 3 ul
SS RT II (BRL): 1 ul

The above solution is added to the hybridization reaction and incubated for 30 min. at 42°C.
Then, 1 ul SSII is added and incubated for another hour before being placed on ice.

The 50X dNTP mix contains 25 mM of cold dATP, dCTP, and dGTP, 10 mM
of dTTP and is made by adding 25 ul each of 100 mM dATP, dCTP, and dGTP; 10 ul of
100 mM dTTP to 15 ul H2O.
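The stated composition of the 50X dNTP mix can be verified from the recipe. The Python
sketch below computes the total volume and the concentration of each nucleotide; the
function name `dntp_mix` is hypothetical and the default values are those given in the text.

```python
def dntp_mix(stock_mm: float = 100.0,
             datp_ul: float = 25.0,
             dctp_ul: float = 25.0,
             dgtp_ul: float = 25.0,
             dttp_ul: float = 10.0,
             water_ul: float = 15.0):
    """Return (total volume in ul, {nucleotide: concentration in mM})."""
    volumes = {"dATP": datp_ul, "dCTP": dctp_ul, "dGTP": dgtp_ul, "dTTP": dttp_ul}
    total_ul = sum(volumes.values()) + water_ul
    # Each nucleotide is diluted from the same stock concentration.
    concentrations = {name: stock_mm * vol / total_ul for name, vol in volumes.items()}
    return total_ul, concentrations


if __name__ == "__main__":
    total, conc = dntp_mix()
    print(f"total: {total} ul")
    for name, value in conc.items():
        print(f"{name}: {value:.1f} mM")
```

The recipe gives 100 ul of mix with 25 mM each of dATP, dCTP and dGTP and 10 mM dTTP, in
agreement with the stated composition.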

RNA degradation is performed as follows. Add 86 ul H2O, 1.5 ul 1 M NaOH/
2 mM EDTA and incubate at 65°C, 10 min. For U-Con 30, 500 ul TE/sample spin at 7000 g
for 10 min, save flow through for purification. For Qiagen purification, suspend u-con
recovered material in 500 ul buffer PB and proceed using Qiagen protocol. For DNAse
digestion, add 1 ul of 1/100 dilution of DNAse/30 ul Rx and incubate at 37°C for 15 min.
Incubate for 5 min at 95°C to denature the DNAse.

Sample preparation

For sample preparation, add Cot-1 DNA, 10 ul; 50X dNTPs, 1 ul; 20X SSC,
2.3 ul; Na pyrophosphate, 7.5 ul; 10 mg/ml herring sperm DNA, 1 ul of 1/10 dilution; to
21.8 ul final vol. Dry in speed vac. Resuspend in 15 ul H2O. Add 0.38 ul 10% SDS. Heat at
95°C for 2 min and slow cool at room temp. for 20 min. Put on slide and hybridize overnight at
64°C. Washing after the hybridization: 3X SSC/0.03% SDS: 2 min., 37.5 ml 20X
SSC + 0.75 ml 10% SDS in 250 ml H2O; 1X SSC: 5 min., 12.5 ml 20X SSC in 250 ml H2O;
0.2X SSC: 5 min., 2.5 ml 20X SSC in 250 ml H2O. Dry slides and scan at appropriate PMTs
and channels.
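The wash buffer recipes can be checked against their nominal strengths with a simple
dilution calculation. The Python sketch below assumes that "in 250 ml H2O" denotes a final
volume of 250 ml, which is the reading under which the recipes reproduce the named
strengths; the names `diluted`, `WASHES` and `wash_table` are hypothetical.

```python
FINAL_VOLUME_ML = 250.0  # assumption: recipes are made up to 250 ml final volume

# (wash name, ml of 20X SSC, ml of 10% SDS)
WASHES = [
    ("3X SSC/0.03% SDS", 37.5, 0.75),
    ("1X SSC", 12.5, 0.0),
    ("0.2X SSC", 2.5, 0.0),
]


def diluted(stock_strength: float, stock_ml: float,
            final_ml: float = FINAL_VOLUME_ML) -> float:
    """Strength of a stock after dilution to the final volume."""
    return stock_strength * stock_ml / final_ml


def wash_table():
    """Return (name, SSC strength in X, SDS percentage) for each wash."""
    return [
        (name, diluted(20.0, ssc_ml), diluted(10.0, sds_ml))
        for name, ssc_ml, sds_ml in WASHES
    ]


if __name__ == "__main__":
    for name, ssc_x, sds_pct in wash_table():
        print(f"{name}: {ssc_x:.2f}X SSC, {sds_pct:.3f}% SDS")
```

Under that assumption the three recipes work out to 3X SSC with 0.03% SDS, 1X SSC and 0.2X
SSC respectively.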

Example 2: Taxol resistant Xenograft Model of Human Prostate Cancer 

Treatment regimens that include paclitaxel (Taxol; Bristol-Myers Squibb
Company, Princeton, NJ) have been particularly successful in treating hormone-refractory
prostate cancer in the phase II setting (Smith et al., Semin. Oncol. 26(1 Suppl 2):109-11
(1999)). However, many patients develop tumors which are initially, or later become,
resistant to taxol. To identify genes that may be involved with resistance to taxol, or are
regulated in response to taxol resistance, and therefore may be used to treat, or identify, taxol
resistance in patients, the following experiments were carried out.

The androgen-independent human cell line CWR22R was grown as a
xenograft in nude mice (Nagabhushan et al., Cancer Res. 56(13):3042-3046 (1996); Agus et
al., J. Natl. Cancer Inst. 91(21):1869-1876 (1999); Bubendorf et al., J. Natl. Cancer Inst.
91(20):1758-1764 (1999)). Initially, these xenograft tumors were sensitive to therapeutic
doses of taxol. The mice were treated continuously with sub-therapeutic doses, and the
tumors were allowed to grow for 3-4 weeks, before surgical removal of the tumors. The
tumor from an individual mouse was then minced, and a small portion was then injected into
a healthy nude mouse, establishing a second passage of the tumor. This mouse was then
treated continuously with the same sub-therapeutic dose of taxol. This process was repeated
14 times, and a portion of each generation of xenograft tumor was collected. There was
increasing resistance to therapeutic doses of taxol with each generation. By the end of the
process, the tumors were fully resistant to therapeutic doses of taxol. RNA from each
generation of tumor was then isolated, and individual mRNA species were quantified using a
custom Affymetrix GeneChip® oligonucleotide microarray, with probes to interrogate
approximately 35,000 unique mRNA transcripts. Genes were selected that showed a
statistically significant up-regulation, or down-regulation, during the subsequent generations
of increasingly taxol-resistant tumors. Only one gene was significantly up-regulated, whereas
24 genes were down-regulated; these are presented in Table 10.

The gene sequences identified to be overexpressed in prostate cancer may be
used to identify coding regions from the public DNA database. The sequences may be used
to either identify genes that encode known proteins, or they may be used to predict the coding
regions from genomic DNA using exon prediction algorithms, such as FGENESH (Salamov
and Solovyev, 2000, Genome Res. 10:516-522). In addition, one of ordinary skill in the art
would understand how to obtain the unigene cluster identification and sequence information
according to the exemplar accession numbers provided in Tables 1-16 (see,
http://www.ncbi.nlm.nih.gov/UniGene/).



TABLE 1: shows genes, including expression sequence tags, differentially expressed in
prostate tumor tissue compared to normal tissue as analyzed using the Affymetrix/Eos Hu01
GeneChip array. Shown are the relative amounts of each gene expressed in prostate tumor
samples and various normal tissue samples showing the highest expression of the gene.

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: Ratio of tumor to normal body tissue

Pkey UnigeneID ExAccn Unigene Title R1

131919 H&272458 AA121266 ESTs 37.2 

120328 Hs290905 AA196979 ESTs; Weakly similar to (deflme not ava 325 

20 105201 Hs51412 M195626 ESTs 30.1 

101486 Hs.1852 M24902 add phosphatase; prostate 252 

119073 H&279477 R32894 ESTs 245 

133428 H&183752 M34376 rrdcrosemlnoproteh; bala- 235 

128180 Hs.171995 AA59S348 kaDikreln 3; (prostate specific antigen 2U 

25 104080 Hs57771 AA402971 Homo sapiens mRNA for serine protease (T 185 

127537 Hs.162859 AA569531 ESTs 185 

131665 Hs.30343 R22139 ESTs 17/ 

101050 Hs.1832 K01911 neuropeptideY 175 

130771 Hs.1915 N48056 folate hydrolase (prostate-specific memb 17 

30 108153 Hs.40808 AA054237 ESTs 165 

107485 Hs.262476 W63793 S-adenosytmethionine decarboxylase 1 16.7 

106155 Hs.33287 AA425309 ESTs 165 

129534 Hs.11260 R73640 ESTs 16.4 

100569 Hs.171995 HG2261-HT2351 Antigen, Prostate Specific, Aft. SpEce 16 

35 101889 Hs.181350 S39329 Wkrein 2; prostatic 15.4 

135389 Hs59872 U05237 fetal Alzheimer arrtigen 15 

101508 Hs52192 M27436 coagulaSon factor ill (thromboplastin; 135 

134374 Hs.8236 D62633 ESTs 12.7 

133944 Hs.7780 AA045870 ESTs 125 

40 109141 Hs.193380 AA176428 ESTs 125 

130974 H&2178 X57985 H2B hlstone family; member Q 115 

114768 Hs.182339 AA149007 ESTs 115 

104394 Hs.172129 H46617 yp19M/1 Scares breast SNbHBst Homo sap 115 

125299 Hs.102720 Z39436 ESTs 115 

45 104660 Hs.14846 AA007160 ESTs 11/ 

100116 Hs.78045 DO0654 acfimgamma 2; smooth muscle; enteric 11 

131061 Hs£68744 N64328 ESTs; Moderately similar to KIAA0273 (R 105 

126645 126645 AI167942 Homo sapiens BAG done RG041D1 1 from 7q2 10.7 

135153 Hs55420 N40141 Homo sapiens mRNA for JM27 protein; oomp 10.6 

50 107033 Hs.1 13314 AA599629 ESTs - 10.6 

118417 N66048 ESTs; Weakly similar to polymerase [Hsa 105 

126758 Hs.293960 W37145 ESTs 102 

115674 Hs.8364 AA406542 ESTs 10.1 

134989 Hs.92381 AA236324 ESTs; Weakly similar to IIS ALU CLASS A 10.1 

55 107102 HS50652 AA609723 ESTs 10.1 

116787 Hs.15641 H28581 ESTs 10.1 

115719 HS59622 AA416997 ESTs 10 

123209 Hs.203270 AA489711 ESTs 9.9 

101664 Hs.121017 M60752 H2A histone family; member A 95 

60 112971 Hs.83883 T17185 ESTs 9.7 

102519 Hs.80296 U52969 Purkinje cell protein 4 9.7 

117984 Hs.106778 N51919 ESTs 9.7 

105840 H&22209 AA398533 ESTs 9.4 

129523 Hs.274509 M30894 T-cell receptor, gamma cluster 9.4 

65 132984 Hs.167133 AA031360 ESTs 9.2 

121853 Hs.98502 AA425887 ESTs 9 
115764 H&91011 AA421562 anterior gradient 2 (Xenopus laevis; sec 85 

119817 Hs55999 W47380 ESTs 85 

100552 H&301946 HG2167-HT2237 Protein Kinase Ht31 , Camp-Dependent 85 

105627 H&23317 AA281245 ESTs 85 

101461 Hs.76422 M22430 phospholipase A2; group IIA (platelets; 8.7 

131725 Hs51146 AA45S264 ESTs; Highly similar to (defllne not ava 85 

124526 Hs593185 N62098 yz61c5x1 Soaresjnutlipla_sctetosis_2NbH 85 

118528 Hs.49397 N67889 ESTs 85 

133845 Hs.76704 T68510 ESTs 85 

10 133354 HS534762 AA055552 ESTs;WeakrysMlartoKIAA<B19[H.sapi 8.1 

105912 Hs50415 AA402000 ESTs; Weakly similar to GS3786 [H^aplen 8 

119018 HS578S95 N95796 ESTs 8 

100394 Hs56052 D84276 CD3Barrtkjen(p45) 8 

114132 HS54192 Z38688 ESTs 75 

116786 Hs501527 H25836 tumor necrosis factor (ligand) superfami 7.7 

106579 HS53023 AA456135 ESTs 75 

128780 Hs.105700 AA291725 secreted frizzled-related protein 4 75 

114965 Hs.72472 AA250737 ESTs 7.4 

112033 HS52627 R43162 ESTs 7.1 

20 102398 U42359 Human N33 protein form 1 (N33) gene, exo 7 

101201 Hs5256 122524 matrix metalloproteinase 7 (matrilysin; 65 

109272 HS588462 AA195718 ESTs 65 

103145 Hs.169849 X66276 myosin-binding protein C; slow-type 6.9 

101803 Hs.155691 M86546 pre-B-cell leukemia transcription factor 6.8 

25 120562 Hs5Q2267 AA280036 ESTs; Weakly similar to W01 A6« [Celega 6.8 

109112 HS257924 AA169379 ESTs 6.8 

109795 HS526416 F10707 ESTs 6.7 

107532 Hs.173684 Z19643 ESTs; Weakly similar to (defllne not ava 6.7 

130336 H&171995 X07730 kallikrein 3; (prostate specific antigen 6.6 

30 131425 HS56691 AA219134 ESTs 6.6 

120588 Hs.16193 AA281591 Homo sapiens mRNA; eDNA DKFZp586B21 1 (fr 6.6 

132902 Hs59838 AA490969 ESTs 65 

125674 Hs523378 W28078 H.sapiens mRNA for transmembrane protein 6.6 

133724 Hs.75746 U07919 aldehyde dehydrogenase 6 65 

35 130343 Hs578628 AA490262 ESTs; Moderately similar to APXL gene pr 65 

120215 Hs.108787 Z41050 Homo sapiens Mcd4p homolog mRNA; complet 65 

129215 Hs.126085 AA176867 ESTs 65 

131881 Hs5383 AA010163 upstream regulatory element binding prot 65 

133376 Hs.7232 T23670 ESTs 6.4 

40 105376 Hs.8768 AA236559 ESTs; Weakly similar to neuronal thread 6.4 

104674 HS56289 AA009527 ESTs 6.4 

100727 Hs.334786 X07290 Human HF.12 gene mRNA 65 

130150 Hs.15113 AF000573 homogentisate 1,2-dioxygenase (homogenti 65 

121770 Hs.278428 AA421714 Homo sapiens mRNA for KIAA0896 protein; 65 

45 123475 Hs550528 AA599267 ESTs; Weakly similar to ANKYHIN; BRAIN V 65 

133061 HS596638 AB000584 prostate differentiation factor 65 

116429 Hs579923 AA609710 ESTs; Weakly similar to similar to GTP-b 65 

101233 Ks578 129008 sorbitol dehydrogenase 65 

104691 Hs57744 AA011176 ESTs 65 

127248 AA325029 EST27953 Cerebellum II Homo sapiens cDNA 65 

127775 Hs.179902 H04106 ESTs; Weakly similar to (define not ava 65 

105500 HS522399 AA256485 ESTs 6.1 

131463 Hs.2714 X74142 forkhead (Drosophila)-Iike 1 - 6.1 

132116 Hs.40289 AA234767 ESTs 6 

55 130828 HS503213 AA053400 ESTs 5.9 

115357 Hs.72988 AA281793 ESTs 55 

105496 HS501997 AA256323 ESTs 5.7 

116334 Hs.48948 AA491457 ESTs 5.7 

107968 Hs.61539 AA034020 ESTs 5.7 

60 120132 Hs.125019 238839 ESTs; Weakly similar to ID1 ALU SUBFAMI 5.6 

106375 Hs589072 AA443993 ESTs 5.6 

132550 Hs.170195 AA029597 bone morphogenetic protein 7 (osteogenic 55 

124777 Hs.140237 R41933 ESTs; Weakly similar to neuronal thread 5.6 

100311 Hs537616 D50640 phosphodiesterase 3B; cGMP-inhibited 5.6 

65 101791 Hs.62354 M83822 Human beige-like protein (BGL) mRNA; par 55 

117698 Hs.45107 N41002 ESTs 55 

132387 Hs581434 R70914 heat shock 70kD protein 1 55 

122041 Hs58732 AA431407 Homo sapiens Chromosome 16 BAC clone CIT 55 

133723 Hs562476 AA088851 S-adenosylmethionine decarboxylase 1 55 
113938 W81598 ESTs S.4 

133015 H&24631S AA047036 ESTs 5.4 

125745 Hs.75722 AI283493 ribophorin II 54 

107295 H&80120 T34527 UDP-N-acetyl-alpha-D-galactosamine:polyp 5-4 

5 108186 Hs.7780 AA058482 ESTs 65 

100184 H&21223 017408 calponin 1; basic; smooth muscle S3 

1044S6 Hs&6392 N25110 Human guanine nucleotide exchange factor 5.3 

104033 H&9S944 AA365031 ESTs S3 

110844 Hs.167531 N31952 ESTs; Weakly similar to (deffine not ava S3 

10 129056 Hs.108338 H70827 ESTs; Weakly similar to Dtl ALU SU8FAM1 S3 

102805 H&25351 U90304 iroquois-class homeodomain protein S3 

133493 Hs.194369 AA284143 Homo sapiens chromosome 1 atrophin-1 rel S3 

129184 Hs.109201 W26769 ESTs; Highly similar to (deffine not ava 5.2 

134158 Hs.78428 U15174 BCL2/adenovirus E1B 19kD-interacting pro S2 

IS 107240 Hs.159872 059368 ESTs S2 

104787 AA027317 ESTs; Weakly sknOar to UU ALU SUBFAMI 5:2 

123527 Hs.108327 AA608679 damage-specific DNA binding protein 1 (1 52 

116646 Hs.194228 F03048 ESTs; Moderately similar to till ALU SUB 52 

101448 Hs.195850 M21389 keratin 5 (epidermolysis bullosa simplex 5.1 

20 116188 Hs.184598 AA464728 ESTs; Weakly stater to Ull ALU SUBFAMI 5.1 

126259 H&281428 Z21472 ESTs; Moderately similar to W ALU SUB 5.1 

105921 Hs.169119 AA402613 ESTs 5.1 

103375 H&54416 X91868 sine oculis homeobox (Drosophila) homolo 5.1 

128871 Hs.106778 AA4O0271 ESTs; Highly slrriar to (deffine not ava 5.1 

25 112681 Hs.148932 R87331 ESTs; Moderately similar to semaphorin V 5.1 

105784 H&226434 AA350771 ESTs 5.1 

116238 Hs.47144 AA479362 ESTs 5 

102913 I&80342 X07696 keratin 15 5 

103011 Hs326035 X52541 early growth response 1 5 

30 126023 H58881 yr36d09.r1 Scares fetal liver spleen INF 5 

103709 Hs.13804 AA037316 ESTs 5 

118981 Hs.39288 N93839 ESTs; Weakly similar to 111! ALU SUBFAMI 5 

134807 Hs.89732 X78932 zinc finger protein 273 5 

100079 H&23311 AB002365 Human mRNA for KIAA0357 gene; partial cd 4.9 

35 132047 H&3796 083492 EphB6 4.9 

132880 Hs.177537 AA444369 ESTs 4.9 

124049 Hs.74519 F10523 primase; polypeptide 2A (53kD) 43 

133330 Hs.71119 U42360 Human N33 mRNA; complete cds 4.8 

104776 AA026349 ESTs 4.8 

122593 Hs.128749 AA453310 Homo sapiens alpha-methylacyl-CoA racema 4.8 

103912 Hs.143087 AA251078 ESTs 4.8 

113961 Hs.26009 W86307 Homo sapiens mRNA for KIAA0860 protein; 4.8 

105288 Hs.3585 AA233168 ESTs; WeaJdy similar to coded for by C. 4.8 

135035 H&284186 H89575 ESTs 4.8 

45 104144 Hs.183390 AA447439 ESTsWeakrysMa/toZlNCHNGERPROT 4.8 

129389 Hs.288126 AA621604 ESTs 4.8 

125982 R98091 RAE1 (RNA export 1 ; S.pombe) homolog A3 

125162 HS26243 W44682 ESTs 43 

103023 Hs.1 17950 X53793 muttauRcfional pcdypepfbia similar to S 4.7 

50 129735 W80701 ESTs; Weakly similar to HERV-E envelope 4.7 

104479 Hs.106390 N36040 ESTs 4.7 

103731 AA070545 zm7c3^StratageneneuroepMion(«93 4.7 

126575 Hs.127602 W72416 ESTs - 4.7 

124578 Hs.231500 N66321 Human glucose transporter-eke protato-l 4.7 

130617 Hs.1674 M90516 glutamine-fructose-6-phosphate transamin 4.7 

116752 Hs.91622 H06373 Homo sapiens clone 24456 mRNA sequence 4.7 

100279 HS&007 D42084 Human mRNA for K1AA0094 gene; partial cd 4.7 

126288 Hsa9576 AI479264 ESTs 4.7 

131836 H&32990 AA610086 . ESTs 4.7 

106717 Hs239489 AA465093 TIA1 cytotoxic granule-associated RNA-bi 4.7 

114542 Hs51011 AA055768 ESTs 43 

103806 AA130614 zotf2j1 Stratagena neuroepitheSum NT2R 4B 

130529 AA173238 small inducible cytokine A5 (RANTES) 4.6 

115675 HS&065 AA406546 ESTs 43 

65 111386 H&293798 N95326 ESTs 4.6 

106503 Hs.28679 AA452411 ESTs 43 

119943 Ks.14158 W86835 copinelll 4.6 

104459 Hs.100070 M91493 EST 43 

100774 Hs.89603 HG371-HT1063 Mucin 1, Epithelial, Alt. Splice 6 4.6 
I006S2 Hs.142653 HG2825W2949 
132015 Hs3731 D118D0 
26086 H70975 
130888 Hs.173094 F03819 
I06330 H&20166 AA446864 
i AA199853 
131584 H&29117 X91648 
104838 Hs20953 AA039481 
125661 R50319 
103171 Hs234726 X68733 
03928 Hs.199160 AA280085 
102899 Hs.75730 X06272 
100892 Hs.180789 HG4557-HT4362 
106167 Hs.7956 AA425906 
129404 H&317584 AA172056 
106990 H&24758 AAS21354 
132316 K&44S68 U28S31 
132056 H&38176 T89386 
133718 Hs.198760 X15306 
101470 Hs.1846 M22898 
131904 H&284296 AA143019 
1 05804 HS22514 AA383142 
122861 Hs.119394 AA464428 

I33S H&29894 N79565 
_J944 H&98518 AA429278 
134401 Hs511577 AA243746 
126458 H&2B8969 AA815252 
133435 H&32396S T23983 
105178 HS21941 AA187490 
127315 AA640834 
I32645 Hs54424 X87870 
16162 H&282S90 AA461487 
18040 H&47567 N52876 
1 30008 H&278427 M31423 
126607 Hs.114688 W87424 
123061 Hs.105130 AA482030 
109391 Hs.184245 AA219699 
09175 AA180496 
27003 Hs.173540 AA550806 
102547 H&46638 U57S11 
134208 Hs.79993 U88871 
104258 HS5462 AF007216 
30759 H&18946 AA094720 
132160 H&295923 AA281770 
135062 HS53872 AA174183 
126510 Hs334762 R49702 
122055 Hs.98747 AA431732 
133136 Hs.6574 AF007165 
109890 H&20843 H04649 
133294 Hs.69997 R79723 
134436 H&83190 S80437 
07375 H&251064 U88573 
122223 K&27413 AA436158 
I03044 Ha248210 X55777 
120125 Hs£9815 W99362 
I28969 Hs£83978 T65327 
I29637 Hs.1179 D90359 
106566 AA455921 
12605 H&29852 R79220 
103364 Hs.279929 X90872 
132811 H&57419 U25435 
126570 HSJ26292 T79274 
16298 Hs.94109 AA489046 
03024 Hs.105938 X53961 
129133 Hs.108850 R56728 
133167 Hs.6641 N98707 
126871 Hs.14051 AA351779 
132333 Hs.45032 AA192157 
107376 Hs327179 U90545 
ESTs 
ESTs; Highly similar to CGI protein pis 
ESTs; Weakly similar to fin ALU SUBFAMI 
ESTs; Moderately similar to (ill ALU SUB 
ESTs 

nr27b06j1 NCI_CGAP_Pr3 Homo sapiens cDN 
Rsapiens mRNA for hepatocyte nuclear fa 
ESTs; Weakly similar to F52C123 [Geleg 
EST 
EST 
ESTs 
ESTs 

ESTs; Weakly similar to (defDne not ava 
chromosome 11 open reading frame 3 



solute carrier famly 4; sodium bicarbon 
ESTs; Weakly staBar to (defDne not ava 
seven in absentia (DrosophSa) homolog 1 
ESTs 

ESTs; Weakly similar to K1AA0319 [H^api 
EST 



ESTs 

H^aplens mRNA for transBn associated z 
fatty acid synthase {3 1 region} [human, 
NBR2 
ESTs 

H .sapiens Mahlavu hepatocellular carcino 
EST 

ESTs; Highly similar to (deffine not ava 
TATA box binding protein (TBPHtssodate 
ESTs; Weakly similar to I1H ALU SUBFAMI 
ESTs 

Ksapiens mRNA for gp25L2 protein 
transcriptional repressor 
ESTs 
ESTs 

lactotronsfemn 

yg95c6.r1 Soares infant brain 1NIB Homo 

Mnesin family member 5C 

ESTs 

ESTs 

solute carrier family 17 (sodium phospha 



4.6 
4.6 
4.6 
4.6 
4.6 
4.5 
4.5 
4.5 
4.5 
4JS 
4£ 
45 
4S 
4£ 
4JS 
4S 
4.4 
4.4 
4.4 
4.4 
4.4 
4.4 
4.4 
4A 
4.4 
4.4 
4.4 
4.4 
43 
43 
4.3 
4.3 
42 
4.3 
43 
43 
4.3 
43 
43 
43 
43 
43 
43 
43 
43 
41 
42 
42. 
42 
42 
42 
42 
. 42 
42 
42 
42 
42 
42 
42 
42 
42 
42 
42 
4.1 
4.1 
4.1 
4.1 
4.1 
4.1 






128517 Hs.100861 AA280617 ESTs; Weakly similar to p60katanln[as 4.1 

130555 Hs.1 16774 AA450324 ESTs 4.1 

105765 H&24183 AA343514 ESTs 4.1 

126S29 H&26369 AA133237 ESTs 4.1 

S 125928 Hs.181889 H28730 ESTs 4.1 

117280 Hs.172129 N22107 ESTs; Moderately similar to 110 ALU SUB 4.1 

100234 H&3085 D2S677 K1AAD054 gene product 4.1 

100959 Hs.1 18127 J00073 actin; alpha; cardiac muscle 4.1 

107130 H&12913 AA620582 ESTs; Weakly similar to {defline not ava 4.1 

10 105035 Hs.8859 AA128486 ESTs 4.1 

126735 H&226795 AA808949 glutathione S-transferase pi 4.1 

113056 H&8036 T26471 ESTs; Moderately similar to lill ALU SUB 4 

102460 H&211582 U48959 Homo sapiens myosin light chain kinase ( 4 

106868 H&26813 AA504631 ESTs; Weakly similar to (defline not ava 4 

15 123107 Hs.104207 AA486071 ESTs 4 

127256 H&267967 AA327550 ESTs; Wealdy similar to ilil ALU SUBFAMI 4 

105329 HS22862 AA234561 ESTs 4 

115504 Hs.42736 AA291948 ESTs 4 

120726 Hs57293 AA293656 ESTs 4 

103576 H&94560 Z26317 desmoglein 2 4 

127889 Hs.144941 AI147408 ESTs 4 

106394 Hs.25320 AA447223 ESTs 4 

128046 AA873285 ESTs 4 

103391 Hs.114366 X94453 pyrroline-5-carboxylate synthetase (glut 4 

25 106448 Hs.27004 AA449455 ESTs 4 

126513 Hs.86276 W27601 ESTs; Moderately similar to (define not 4 

129593 HS58314 AA487015 ESTs; Weakly similar to 111! ALU SUBFAMI 35 

110151 HS5160B H18836 ESTs 35 

105344 H&8645 AA235303 ESTs 3.9 

30 104791 H&501871 AA029046 ESTs 35 

123442 Hs.1 11496 AAS98803 ESTs 3.9 

127800 Hs.79428 AA521047 BCL2/adenovirus E1B 19kD-interacting pro 3.9 

114555 Hs.167904 AA058594 ESTs 35 

122138 Hs.163960 AA4355 49 ESTs 35 

129565 Hs.198726 X77777 vasoactive intestinal peptide receptor 1 3.9 

103471 Hs.75216 Y00815 protein tyrosine phosphatase; receptor t 3.9 

133908 Hs525474 M83216 caldesmon 1 3.9 

105635 Hs501985 AA281508 ESTs 35 

134285 Hs51086 AA460012 solute carrier family 22 (organic cation 35 

40 134125 H&50421 R38102 WAAQ203 gene product 35 

125628 H&241493 AA4180S9 natural killer-tumor recognition sequenc 35 

103695 Hs.186600 AA018758 ESTs 35 

100642 Hs.182183 HG2743HT3926 Caldesmon 1, Alt. Splice 6, Non-Muscle 35 

104334 Hs.78771 D82614 ESTs 35 

45 110242 Hs.19978 H26417 ESTs 3.9 

125298 HS289008 Z39255 ESTs 35 

104060 Hs503193 AA3979S8 zt87a9j1 Soares_tastis_NHT Homo sapiens 35 

105323 H&293960 AA398197 ESTs 35 

126499 Hs.110445 AA315671 ESTs; Moderately similar to unknown (M jn 35 

50 130752 Hs.18895 050927 KIAA0137 gene product 3.8 

123494 Hs.112110 AA599786 ESTs 3.8 

104346 Hs52478 AA040154 ESTs 35 

108921 Hs.71721 AA142913 ESTs - 35 

115506 Hs.45207 AA292537 ESTs 3.8 

55 100452 Hs241552 D87742 Human mRNA for KIAA0268 gens; partial cd 35 

104454 Hs.129228 M84443 galactokinase 2 3.8 

108730 Hs.102859 AA126254 ESTs 35 

131223 Hs54427 AA247788 ESTs; Highly similar to (deffine not ava 3.8 

104784 Hs.269228 AA027055 ESTs 3.8 

60 104946 Hs.73848 AA069549 ESTs 35 

106932 Hs5394 AA495926 ESTs 35 

101724 Hs.620 M69225 bullous pemphigoid antigen 1 (230/240kD) 35 

106140 Hs.14912 AA424524 Homo sapiens mRNA for KIAAQ286 gene; par 35 

128135 Hs269721 AA913491 ESTs 35 

65 120030 Hs£8S94 W92051 ESTs 35 

126457 Hs503B2 AA007489 zh98g04j1 Soares_feta|Jver_spleen_1NF 35 

123917 Hs.1 12969 AA621311 EST 3.7 

110714 Hs.17752 H95978 Homo sapiens phosphatidylserfne-specific 3.7 

130577 Hs.162 M35410 insulin-like growth factor binding prote 3.7 
117667 H&44708 N39214 ser-thr protein kinase related to the my 3.7 

126104 Hs59712 N77278 ESTs; WoaMy similar to BONE/CARTILAGE P 3.7 

100379 H&27B721 D82060 Homo sapiens mRNA for membrane protein w 3.7 

115646 H&305971 AA404352 ESTs 3.7 

5 125792 Hs.183700 A1005388 ESTs; Moderately stellar to Ml ALU SUB 3.7. 

102162 Hs.1592 U18291 CDC16 (cell division cycle 16; S. cerevt 3.7 

128530 Hs.183475 AA504343 ESTsjModeratelyslrnilarto till ALU SUB 3.7 

119940 Ha272531 W86779 EST 3.7 

110769 H&23837 N22222 yw34b06^1 Morton Fetal Cochlea Homo sap 3.7 

10 132914 Hs.60293 AA496037 ESTs 3.7 

113594 Hs.15683 T92030 ESTs 3.7 

103702 HaZ79952 AA027793 ESTs; Highly similar to (define not ava 3.7 

130780 Hs.19347 AA248406 ESTs 3.7 

123288 H&291025 AA495B36 EST 3.7 

15 120691 H&22380 AA291173 ESTs 3.7 

103153 Hs.75295 X66534 guanylate cyclase 1; soluble; alpha 3 3.7 

129201 Hs.109390 H19989 ESTs 3.7 

114798 Hs54900 AA159181 ESTs 3.7 

126801 Hs.7337 AA512902 ESTs 3.7 

20 105503 Hs51707 AA256616 ESTs 3.7 

104260 Hs.194283 AF008192 Homo sapiens putative GR6 protein (GR6) 3.7 

125980 Hs.35699 R97219 ESTs 3.7 

123255 Hs.105273 AA490890 ESTs 3.6 

103862 Ks.6363 AA206625 ESTs 3.6 

25 100696 Hs.121688 HG3162-HT3339 Transcription Factor lia 3.6 

134917 Hs.166994 X87241 FAT tumor suppressor (Drosophila) homolo 3.6 

103520 Y10511 KsaplensmRNA for C0176 protein 3.6 

113778 Hs.302738 W15263 ESTs 3.6 

101838 Hs.75511 M92934 connective tissue growth factor 3.6 

30 113702 T97307 ESTs; Moderately similar to IIB ALU SUB 3.6 

118201 Hs.48428 N59800 EST 3.6 

116519 Hs.68554 C20780 EST 3.6 

105886 Hsi2983 AA400517 ESTs; Moderately stailar to UDP-GLUCOSE: 3.6 

106709 Hs.170291 AA464696 ESTs 3.6 

35 127858 Hs27973 AA806365 oc26h075l NCLCGAP_GCB1 Homo sapiens cD 3.6 

101964 S81578 dioxin-resoonsiva gene {putative potyade 3.6 

105508 Hs526416 AA256680 ESTs 3.6 

116844 Hs.337434 H64938 ESTs 3-6 

105372 Hs.142286 AA236481 ESTs 3.6 

40 100745 Hs.144630 HG3510+1T3704 V-Erba Related Ear-3 Protein 3.6 

127521 Hs.164018 AA809982 ESTs 35 

110758 HS274265 N21365 talin 3.6 

107307 Hs.44155 T52099 creatine kinase; mitochondrial 2 (sarcom 3.6 

133200 Hs.183639 AA432248 ESTs 3.6 

45 114774 Hs.184325 AA150043 ESTs 3£ 

120265 H&270696 AA173759 ESTs; Moderately staflar to HU ALU SUB 35 

134359 Hs.199067 M34309 v-erb-b2 avian erythroblastic leukemia v 3.6 

116250 H&44829 AA480975 ESTs; Moderately stailar to till ALU SUB 3.6 

106313 H&35841 AA436459 nuclear factor I/X (CCAAT-binding transc 3.6 

50 131898 H&279780 N52232 ESTs 35 

133444 Hs.73793 M27281 vascular endothelial growth factor 3.6 

128232 HsJ34641 H06296 ESTs 3-6 

135357 Hs.79572 AA235803 ESTs - 35 

457951 AI369384 arylsulfatase D 35 

55 108407 AA075519 zm87h9.s1 Stratagene ovarian cancer («93 35 

126659 T16245 a disintegrin and metalloproteinase doma 35 

104189 Hs.301804 AA485B05 ESTs 35 

125956 Hs.129014 N53276 ESTs 3-5 

103026 Hs.79386 X54162 Human mRNA lor a 64 Kd autoantigen expre 35 

60 133011 Hs.171921 AA042990 seinadcmialn; immunoglobulin domain (Ig); 35 
131379 Hi26176 R49035 ESTs 35 

126742 Hs.169359 H8410S yr57e06j1 Scares fetal frver spleen INF 35 
105560 Hs506915 AA262783 ESTs 35 
118472 Hs.42179 N66818 ESTs 35 

65 105623 Hs50127 AA280B95 ESTs; Highly similar to t!fl ALU SUBFAMI 35 
120262 Hs.145807 AA172076 ESTs; Moderately similar to I1U ALU SUB 35 
105027 H*26771 AA126472 ESTs 35 
130760 Hs.18953 AA126997 phosphodiesterase 9A 35 
117473 Hs.155560 N30157 ESTs 35 
102663 Hs.168075 U70322 karyopherin (importin) beta 2 15 

126349 Hs.13531 AM42868 ESTs; Weakly similar to (defGne not ava 35 

132154 H&41119 N67179 ESTs 15 

131689 Hs50696 AAS9S653 transcription factor-Ska 5 (bade helix 35 

5 127862 Hs.163191 AA765305 EST 35 

126995 Hs.189810 W26950 Human DNA sequence from PAC388M5 on chr 35 

119071 R31180 ESTs 35 

103941 Hs.96593 AA282978 ESTs 35 

110721 Hs51319 K97678 ESTs 35 

10 '126586 Hs.43036 AA011247 ESTs 35 

103106 Hs.1857 X62025 phosphodiesterase 6G; cGMP-specific; rod 35 

116357 Hs50797 AA504806 Homo sapiens done 23620 mRNA sequence 35 

105309 Hs.4104 AA233790 ESTs 35 

130796 Hs.19525 R39390 ESTs 35 

IS 109101 HS52184 AA167708 ESTs 35 

103134 H&2839 X65724 Norrie disease (pseudoglioma) 35 

131798 HS501449 X86098 adenovirus 5 E1A binding protein 35 

118535 Hs.49418 N67968 ESTs 35 

102592 Hs.11223 U62389 Human putative cytosolic NADP-dependent 14 

125905 Hs.6456 T69868 chaperonin containing TCP1; subunit 2 (b 3.4 

109160 H&301997 AA179387 ESTs 14 

105327 H&211593 AA234440 ESTs 14 

106588 Hs57787 AA456598 ESTs 14 

122635 AA454085 EST 14 

25 132413 H&260116 AA132969 metaHoproteasa 1 (ptinlysln (amily) 14 

131938 Hs54956 AA283620 ESTs 14 

133871 Hs.182793 AA454597 ESTs 3.4 

107175 H&292503 AA621751 ESTs; Weakly similar to KIAA0601 protein 14 

101188 Hs.184298 L20320 cyclin-dependent kinase 7 (homolog of Xe 3.4 

30 126422 Hs237658 H48518 ESTs;H^r/slmilaj:toapoliTOroteInA 14 

118475 N66845 ESTs; Weakly similar to HI! ALU CLASS B 14 

104558 Hs.88959 R56678 ESTs; Weakly similar to 1U1 ALU SUBFAMI 3.4 

128307 Hs.132005 AI453794 ESTs 3.4 

112254 HS25829 R51831 ESTs 3.4 

35 125408 HS59578 N72353 yv37e12j1 Soares fetal liver spleen INF 3.4 

109834 Hs.175955 K00S04 ESTs 3.4 

130844 Hs£0191 D12122 seven In absentia (Drosophila) homolog 2 3.4 

127143 HS20843 AA533553 nj68h04.s1 NCLCGAP_Pr10 Homo sapiens cO 3.4 

135309 Hs.42500 025984 ESTs 3.4 

125724 Hs.295978 AA083407 stimulated trans-acting factor (50 kDa) 14 

127692 Hs.187983 AI021912 ESTs 3.4 

116674 Hs.92127 F04816 ESTs 3.4 

134700 Hs5868 AA481414 golgi SNAP receptor complex member 1 3.4 

114846 Hs.166196 AA234929 ESTs 3.4 

45 103649 Hs.155983 Z70219 Rsaplens mRNA for 5DTR for unknown pro 3.4 

134835 Hs59925 L04569 calcium channel; voltage-dependent; L ty 3.4 

130568 Hs.16085 AA232535 ESTs; Highly similar to (defCne not ava 3.4 

111331 Hs.15978 N78773 ESTs 14 

106036 Ha.10653 AA412505 ESTs 3.4 

50 130987 HS21893 R45698 ESTs 3.4 

112814 H335828 R98192 ESTs 3.4 

127815 Ha255015 AA876009 ob93c1 0^1 NCLCGAPjGCSf Homo sapiens cD 3.4 

100144 Hs.75616 013543 K1AA0018 gene product - 3.4 

101129 Hs247992 L10405 Homo sapiens DNA binding protein for sur 3.4 

55 130874 HS20621 T08287 ESTs 3.4 

106882 Hs.26994 AA489009 ESTs 3.4 

103855 Hs502267 AA195179 ESTs 3.4 

125957 H45213 yo03b08/1 Soares adult brain N2D5H855Y 35 

114048 Hs.146085 W94613 ESTs 13 

60 109826 Hs.75354 F13702 ESTs 35 

125355 Hs.170098 R45630 ESTs; Highly similar to K1AAD372 [H.sapl 35 

104182 Hs.143792 AA479990 ESTs; Weakly similar to glioma amplified 35 

100294 Hs.75454 049396 Human mRMA for Apo1_Human (MER5(Aop1 -Mou 35 

131688 Hs50692 U24153 p21 (CDKN1A)-activated kinase 2 35 

65 116256 Hs58201 AA481256 ESTs; Weakly similar to (defbie not ava 35 

102034 Hs230 U05291 fibromodulin 35 

130072 Hs.14658 R99606 Human chromosome 5ql3.1 done 568 mRNA 13 

114615 Hs.159456 AA083812 ESTs; Highly similar to (defSne not ava 13 

128707 Hs.104105 AA136474 Meis (mouse) homolog 2 35 
115048 Hs.130057 AA252668 ESTs 34 

125862 HS41110 H12084 ESTs 34 

135142 HS44192 R31679 ESTs 34 

103119 Hs4877 X63629 cadherin 3; P-cadherin (placental) 3.3 

5 104460 Hs.62604 M91504 . ESTs 34 

100365 Hs.79284 D78611 mesoderm specific transcript (mouse) hom 34 

131524 Hs401804 N39152 ESTs 33 

102165 Hs.159627 U18321 Death associated protein 3 34 

126966 Hs.182575 R38438 solute carrier family 15 (H+VpepBde tra 34 

10 124839 Hs.140942 R55784 ESTs 34 

100709 Hs.100469 HG3264-HT3441 Af-6 (Gbil02478) 34 



132S67 Hs.61635 AAD32221 Homo sapiens BAC done RG041 D1 1 from 7q2 34 

102927 HS45114 X12876 keratin 18 34 

132616 H&2B3558 AA386264 ESTs 34 

15 125132 Hs.129781 W15495 ESTs 34 

111225 H941652 N68989 ESTs 34 

114956 HS47113 AA243681 ESTs 34 

122235 Hs.112227 AA436475 ESTs 34 

112325 Hs.12315 R56055 ESTs 34 

20 123360 Hs.178604 AA504784 ESTs 34 

105150 Hs.155995 AA169640 Homo sapiens mRNA tor KIAA0643 protein; 34 

107391 H3J284294 W02877 ESTs 34 

113058 Hs.7569 T26893 EST 34 

134371 HS42318 S69790 Brush-1 34 

25 125669 HS433256 R51308 ESTs; Moderately similar to Ifll ALU SUB 34 

111506 Hs494105 R07726 ESTs 34 

122974 Hs.194215 AA478625 ESTs 34 

102369 Hs499867 U39840 hepatocyte nuclear factor 3; alpha 34 

120408 Hs.190151 AA235045 ESTs 34 

30 117993 Hs.47402 N52039 ESTs; Weakly similar to Oil ALU SUBFAMI 34 

129586 Hs.11500 AA437118 ESTs 34 

128138 Hs.126494 A1200825 ESTs 34 

127265 AA332751 EST37214 Embryo, 8 week I Homo sapiens c 34 

107674 Hs.41143 AA011027 Homo sapiens mRNA for KIAA0581 protein; 34 

35 104866 Hs493691 AA045342 ESTs 34 

103427 Hs450655 X97303 H.sapiens mRNA for Ptg-12 protein 32 

132990 HS434334 AA458761 ESTs 34 

127017 Hs451946 AA740146 ESTs 34 

132313 Hs.44481 U13220 forkhead (Drosophila)-like 6 34 

40 106880 HS42425 AA488889 ESTs 34 

107039 Ha 169780 AA599751 homologous to yeast nitrogen permease (c 34 

120870 Hs492581 AA357172 ESTs 34 

107920 H&284207 AA027951 ESTs 34 

104165 Hs.105116 AA459160 EST 34 

45 107012 Hs.63908 AA598745 ESTs 34 

103605 Hs.194657 Z3S402 Rsapiens gena encoding E-cadherin, exon 34 

124006 Hs470016 080302 ESTs 34 

101300 Hs.74137 L40391 Homo sapiens (clone s153) mRNA fragment 34 

101183 Hs.795 L19779 H2A histona family; member 0 34 

50 125596 R25698 yg44h1 1 j2 Soares infant brain 1 NIB Homo 34 

127261 AA661567 nu86b02.s1 NCLCGAP_Ahr1 Homo sapiens cD 34 

120090 Hs49554 W94591 ESTs 34 

129393 Hs.166982 D13435 phosphatidylinositol glycan; class F 34 

120923 Hs.97129 AA382283 ESTs 34 

55 118907 HS474256 N91003 ESTs 34 

111552 Hs.191185 R09411 ESTs 34 

104431 Hs49913 J03019 adrenergic; beta-1-; receptor 34 

133551 Hs478634 D63480 Human mRNA for KIAA0146 gena; partial cd 34 

131615 Hs.192803 014533 xeroderma pigmentosum; complementation g 34 

126547 Hs44072 U47732 transmembrane 4 superfamily member 3 34 

103172 Hs.116774 X68742 integrin; alpha 1 34 

113867 Hs44095 W68845 ESTs 34 

133323 Hs.70937 Z83735 H3 histone family; member K 34 

111597 Hs.189716 R11499 ESTs 34 

65 121515 Hs.104698 AA412133 ESTs 34 

107445 Hs.6639 W28406 ESTs 34 

106887 Hs434335 AA489091 ESTs 34 

123052 Hs.185766 AA481806 ESTs 34 

107072 Hs.130760 AA6091 13 Homo sapiens mRNA; cDNA DKFZp586N0318 (f 34 
102214 H&32964 U23752 SRY (sex-determining region Y)-box 11 32 

123147 AA4S7961 ab11h6.s1 Stratagene king (S93721) Homo 32 

125435 Ha272138 R00940 y887gD3.fi Soares fetal liver spleen INF 32 

116246 H&250646 AA478361 ESTs; Highly similar to ubkjuiert-oonjug 32 

5 105169 Hs.180789 AA180321 Homo sapiens (dona S164)mRNA; 3 end o 32 

134001 Hs.78344 AF001548 myosin; heavy polypeptide 11; smooth mus 3J2 

124868 H&304389 R68571 ESTs 32 

133205 Hsj67619 AA089559 Homo sapiens mRNA; chromosome I specific 32 

102986 Hs.182378 X17648 colony stimulating factor 1 (macrophage) 32 

101232 H&242894 128997 ADP-ribosylation factor-like 1 3.1 

132906 Hs234898 AA142857 ESTs; Highly similar to gemWn [Rsapie 3.1 

104281 Hsf669 C14290 ESTs 3.1 

123926 H&227933 AA621348 ESTs; Highly similar to (deflina not ava 3.1 

134464 Hs239720 N79354 ESTs; Wealdy similar to Rga(Djnelanogas 3.1 

IS 105322 Hs.16346 AA234100 ESTs 3.1 

100631 Hs/48332 HG2709WT2805 Serine/Ttoecfiira Kinase (Gb225431 ) 3.1 

130791 Hs.199263 AA259102 ESTs; Highly similar to (define not ava 3.1 

131220 Hs300855 R77200 ESTs 3.1 

113237 Hs.123642 T62857 ESTs 3.1 

20 125562 H&98968 AJ494372 ESTs 3.1 

134110 Hs.79136 U41060 Human breast cancer; estrogen regulated 3.1 

132393 Hs.47334 W85888 ESTs; Moderately stmflar to ifll ALU SUB 3.1 

107439 H&296842 W27995 ESTs; Moderately similar to non-musds m 3.1 

125863 H&40719 AA2990S8 Homo sapiens mRNA; cDNA DKFZp564M091 6 (f 3.1 

25 105811 Hs286192 AA394121 ESTs 3.1 

129284 H&296141 AA104023 ESTs 3.1 

125321 Hs.178294 T86652 ESTs 3.1 

107332 Hs.183297 T87750 ESTs 3.1 

123570 Hs.109653 AA608955 ESTs 3.1 

100384 HsS0800 083646 matrix metalloproteinase 16 (membrane-in 3.1 

109063 Hs38972 AA161043 tetraspanl 3.1 

133284 Hs.182828 U09367 zinc finger protein 136 (clone pHZ-20) 3.1 

131839 Hs53010 H80622 Homo sapiens mRNA tor K1AAD633 protein; 3.1 

117606 Hs/4698 N35115 ESTs 3.1 

35 418998 Hs287849 F13215 ESTs 3.1 

125180 Hs.103120 W58344 ESTs 3.1 

100789 HG3893-HT4163 Phosphoglucomutase 1, Alt. Splice 3.1 

126017 Hs.159440 H60487 ESTs 3.1 

132452 H&247324 AA005262 Homo sapiens DNA sequence from PAC 262D1 3.1 

40 129077 Hs.103479 H78886 ESTs 3.1 

126563 Hs.181368 W26247 U5 snRNP-spedfic protein (220 kO); orth 3.1 

129650 Hs.118258 N52554 ESTs 3-1 

123465 AA599033 ESTs 3.1 

126436 Hs.152316 AA345339 EST51 345 Gall bladder II Homo sapiens cO 3.1 

45 126460 Hs.167031 W01616 za36d05j1 Soares fetal fiver spleen INF 3.1 

118697 Hs.43234 N72094 ESTs 3.1 

103860 Hs.38057 AA203742 ESTs 3.1 

127968 Hs.124347 AA971439 ESTs 11 

124984 H&223241 T47566 yb15c11.s1 Stratagene placenta (#937225) 3.1 

50 103903 Hs.15220 AA249334 |312.seq.F Human fetal heart. Lambda ZAP 11 

106697 Hs.22242 AA463737 ESTs 3.1 

130892 Hs.20993 AA442604 ESTs; Weakly similar to Ydr374cp (S.cere 3 

114032 Hs.35014 W92779 ESTs - 3 

128835 Hs.106390 W15528 ESTs 3 

55 103667 HS247815 Z80788 Haptens H4/I gene 3 

126264 Hs250614 N42897 yy13h06j1 Soares melanocyte 2NbHM Homo 3 

132626 H&21275 025755 ESTs 3 

131107 Hs.75354 N87590 ESTs 3 

126780 Hs£811 R12421 ESTs 3 

60 127363 K&22116 AA307744 Homo sapiens Cdc14S 1 phosphatase mRNA; c 3 

103690 HsJ4063 AA016186 ESTs 3 

102589 Hs.8867 U62015 Homo sapiens Cyr61 mRNA, complete cds 3 

125144 Hs24338 W37999 ESTs 3 

132977 Hs.30t404 U28666 RNA binding motif protein 3 3 

65 120714 Hs.146170 AA292689 ESTs 3 

101038 Hs.79411 J05249 replication protein A2 (32kD) 3 

102856 Hs248177 X00090 Human hlstoneH3 gene 3 

105516 H&30738 AA257971 ESTs 3 

131137 K&33287 U85193 nuclear factor I/B 3 
127221 H&241551 AI354332 ESTs 3 

411888 H&24104 R26708 ESTs 3 

131684 Hs3068 U26174 granzyme K (serine protease; granzyme 3; 3 

100629 H&21291 HG2706+TT2802 Serine/Threonine Kinase {(3)225428) 3 

5 119944 Hs£8915 W86838 EST 3 

113801 Hs.1 18281 W38418 zinc finger protein 266 3 

133780 Hs.76152 M14219 decorin 3 

104690 Hs.14449 AA010889 ESTs 3 

126371 H&304139 N57645 EST 3 

10 127635 Hs.1 18346 AA766903 ESTs 3 

128434 Hs.143880 AI190914 ESTs 3 

435761 Hs.187555 AA701941 ESTs 3 

125025 H&50748 T71581 ESTs 3 

124940 Hs.103804 R99599 heterogeneous nuclear ribonucleoprotein 3 

128742 HS251531 D00763 proteasome (prosome; macropain) subunit; 3 

107147 Hs.10450 AA621125 Homo sapiens chromosome 2; 10 repeat teg 3 

112068 Hs22545 R43910 ESTs 3 

105346 Hs263727 AA2354S5 ESTs; Moderately simBar to llil ALU SUB 3 

130972 H&21739 AA370302 Homo sapiens mRNA; cDNA DKFZp586l1518 (f 3 

20 131230 H&274407 AA149987 thymus specific serine peptidase 3 

133743 Hs.76847 N78435 ESTs 3 

127402 Ha227949 AA358869 ESTs; Highly similar to SEC13-RELATE0 PR 3 

117483 Hs.44189 N30426 ESTs 3 

123859 Hs.112699 AA609368 ESTs 3 

25 103983 H&63290 AA298588 EST114219HSC172ceIlsllHomosaptensc 3 

10379S Hs.7367 AA1 12222 EST s; Moderately simfer to (defline not 3 

115092 K&80975 AA255903 CD39-lite4 2.9 

134831 Hs£9890 S72370 pyruvate carboxylase 25 

128579 Hs.101810 AA093378 ESTs; Weakly similar to DO ALU SUBFAMI 25 

30 134193 Hs.7980 F09570 ESTs 25 

123522 Hs.1 12575 AA608577 ESTs 25 

107109 HS32793 AA609943 ESTs 25 

134694 Hs38556 D50405 histone deacetylase 1 25 

134399 Hs32689 H99801 tumor rejection antigen (gp96) 1 2.9 

35 134632 Hs.174139 AA398710 R sapiens RNA for CLCN3 25 

106683 Hs.14512 AA461495 ESTs 25 

108555 AA084963 zn13e12^1 StratagenehNT neuron (#93723 25 

100953 Hs2110 HG945-HT945 Nucleic Acid-Binding Protein (Gb:L1 2893) 25 

130597 Hs.16492 AA173998 ESTs; Weakly similar to weakly similar t 25 

101813 Hs.139226 M87338 replication factor C (activator 1) 2 (40 25 

106636 H&288 AA459950 ESTs 25 

129109 Hs.108708 AA491295 calcium/calmodulin-dependent protein kin 25 

125819 Hs251871 AA044840 stromal cell-derived factor 1 25 

106282 Hs5857 AA433946 ESTs; Weakly similar to (defline not ava 25 

45 100386 Hs.301636 083703 peroxisomal biogenesis factor 6 25 

114548 HS58074 AA056263 ESTs; Moderately similar to Oil ALU SUB 25 

105914 Hs5701 AA402224 Homo sapiens growth arrest and ONA-damag 25 

108552 AA084912 zn11c7.s1 StratagenohNT neuron (#937233 25 

126505 Hs.190057 W26894 16a11 Human retina cONA randomly primed 25 

50 134098 Hs.79086 X06323 Human MRL3 mRNA (or ribosomal protein L3 25 

129721 HS211539 L19161 eukaryotic translation initiation factor 2.9 

100076 Hs377422 AB000897 Homo sapiens mRNA for cadherin RB3, par 25 

117466 Hs.44104 N29862 ESTs - 25 

106335 Hs56688 AA437258 ESTs; Moderately similar to WAP (our-dis 2.9 

55 134510 H&250870 U25265 protein kinase; mkogen-adivated; Unas 2.9 

105835 H&32995 AA398412 ESTs 2.9 
106611 Hs26267 AA458904 ESTs; Weakly similar to torsinA [H.saple 25 

134087 Hs.173824 U51166 thymine-DNA glycosylase 2.9 
100841 Hs.182183 HG2743-HT2846 Caldesmonl, Alt Splfce 4, NorhMusde 25 

104602 R86920 ESTs 2.9 

117203 Hs.42738 H99799 ESTs 25 

131889 HS34073 AA401912 BH-protocadherfn (brain-heart) 2.9 
101707 Hs.155212 M65131 methylmalonyl Coenzyme A mutase 25 

115271 H&5724 AA279422 ESTs 25 

125812 HS287912 H73420 lectin; mannose-binding; 1 2.9 

110740 Hs.19762 H99675 ESTs 2-9 

103406 Hs285728 X95677 Rsaplens mRMA for ArgBPIB protein 2.9 

104577 Hs.132390 R71539 ESTs 2.9 

102772 Hs.161002 U831 15 absent in melanoma 1 25 
131710 H&309B5 
12S231 HS268903 
127380 Hs.15535 
104229 Hs.61289 
126600 Hs.191385 
125175 H&303030 
103849 H&34578 
102126 Hs.78961 
124906 Hs.107815 
131148 Hs.303125 
123158 H&218329 
133667 Hs.75462 
105182 Hs.18271 
133968 HS232068 
117425 H&336901 
111087 H&37637 
129641 Hs.11805 
128639 Hs.102897 
133209 Hs.78265 
135154 H&267812 
126838 H&279609 
103603 Hs.106149 
102139 H&2128 ! 
128104 

127834 Hs.337631 
133101 H&180952 
127250 Hs217916 
135063 Hs.93883 
126323 Hs.68644 
121873 H&145696 
122090 Hs.98684 
118728 Hs.322645 
135400 Hs.99915 
125278 Hs.129998 
124387 Hs.109019 
124803 Hs.12188 
H45968 Hs.32149 
104261 HsSm 
105366 Hsi82093 
106070 H&5957 
131356 H&25960 
112009 Hs£6255 
133199 Hs£50175 
110379 H&33130 
103690 Hs.72085 
128152 

107008 H&23740 
135243 HsS7101 
103058 Hs.184510 
132020 H&293845 
116354 Hs.282566 
125867 Hs.12372 
120603 Hs.98541 
115119 H&46847 
133865 Hs.170280 
109415 Hs.1 10826 
128687 H&23767 
109984 Hs.10299 
133179 Hs.66731 
115998 Hs.336629 
112180 HS25067 
120428 Hs.173694 
106241 Hs.6019 
131060 Hs.22564 
111383 Hs.40919 
102123 Hs.1594 
102722 Hs.79981 
129887 HS274324 
126663 Hs.181297 



AA233225 
W84714 
AI417137 
AB002346 
AA699949 
W52355 
AA187045 
U14575 
R87647 
C00038 
AA488658 
U72649 
M191014 
D15050 
N27154 
N59645 
N66066 
N91246 
M1 14183 
AA126433 
AA858097 
AA127696 
U15932 
AA971000 
M761415 
AA488230 
AI023717 
D10537 
N45014 
AA426270 
AA432141 
N73705 
M23263 
W93523 
M27637 
R4S480 
H4596B 
AF008442 
AA236356 
AA417761 
M13241 
R42714 
AA609773 
H44825 
AA236843 
R20353 
AA598710 
AA215333 
X57348 
AA428990 
AA504262 
H98141 
AA282787 
AA256524 
F09315 
AA227219 
Z33910 
K09594 
U81599 
AA448488 
R49116 
AA236822 
AA430108 
AA160890 
N94527 
U14518 
U79242 
W92041 
M714635 



ESTs; Highly dndar to (define not ava 
ESTs 

Homo sapiens dona 24582 mRNA sequence 
Inositol phosphate S-phosphatase 2 (syn 
ESTs 
EST 

ESTs; Weakly similar to 1111 ALU SUBFAMI 
protein phosphatase 1; regulatory (mhfo 
ESTs 
ESTs 

neat shock 70kD protein 1 

Human BTG2 (BTG2) mRNA; complete cds 

ESTs; Weakly similar to Ydr372ep [S.cere 

Human mHNA for transcription factor AREB 

ESTs 

ESTs 

ESTs 

ESTs 



sorting nexin 4 
pigment epHhstwm-derived factor 
ESTs 

dual specificity phosphatase 5 

0p67g1U1 Soares_Najr_QBC_S1 Homo sapi 

nz22oms1 NCLCGAP_GC81 Homo sapiens cD 

ESTs 

ESTs 

myelin protein zero (Charcot-Marie-Tooth 
yy80g06Jl Soares_rnultlple_scterosis_2Nb 
ESTs 
ESTs 
ESTs 



ESTs 
ESTs 
cycUnK 
ESTs 

RNA polymerase I subunlt 
ESTs 

Homo sapiens done 24416 mRNA sequence 
v-myc avian myelocytomatosis viral relat 
EST 

Homo sapiens done 23904 mRNA sequence 
ESTs 

ESTs; Weakly similar to unknown [Sxerev 

yg2W10/l Soares infantbraln 1NI8 Homo 

ESTs 

ESTs 

straBin 

ESTs 

ESTs 

ESTs 

ESTs; Highly similar to (detune not ava 
Human DNA sequence from done 30M3 on ch 
discs; large (Drcsopn8a)homolog 5 
Homo sapiens CAGF9 mRNA; partial cds 
ESTs 

ESTs; Moderately similar to Hi! ALU SUB 
homeoboxB13 

ESTs; Weakly similar to zinc finger prot 
EST 

ESTs; Moderately similar to (defBne not 

ESTs 

myosin VI 

ESTs 

centromere protein A (1 7kD) 
Human done 23560 mRNA sequence 
PCAF associated factor 65 alpha 
ESTs 




2$ 
25 
23 
23 
23 
23 
23 
23 
23 
23 
2.9 
2J 
23 
23 
2.9 
23 
23 
23 
23 
23 
23 
23 
23 
23 
23 
23 
23 
23 
23 
23 
2.8 
2.8 
2.8 
23 
23 
23 
23 
23 
23 
23 
23 
23 
23 
23 
23 
2.8 
23 
23 
23 
2.8 
23 
2.8 
- 2.8 
2.8 
2.8 
23 
23 
23 
23 
23 
23 
23 
2.8 
23 
23 
23 
2.8 
2.8 
23 
104387 Hs.134342 H17438 ESTs; Weakly similar to seventransmembra 25 

107316 Hs.193700 T63174 ESTs; Moderately similar to till ALU SUB 2.8 

128059 HS.1450S8 AA972446 ESTs 25 

124447 N4800O ESTs 25 

111398 Hs.125565 R00086 deafness; X-linked 1; progressive 25 

134085 Hs.79018 U20979 chromatin assembly factor I (160 kOa) 25 

124788 Hs.100912 R43S43 ESTs 25 

112248 H&326416 R51361 ESTs 25 

121309 Hs57312 AA402482 ESTs 25 

103076 Hs.75319 X59618 ribonucleotide reductase M2 polypeptide 25 

107071 Hs55188 AA609053 ESTs 25 

104425 Hs55380 H8B496 ESTs 2* 

132991 Hs.62245 AA446906 solute carrier family 25 (mOochondrial 2.8 

104968 H&29669 AA084602 ESTs 25 

121153 Hs.97694 AA399640 ESTs 2.8 

131216 HS543901 D31058 ESTs 25 

109682 H&22869 F09299 ESTs 25 

131990 Hs.168818 H77734 ESTs; Moderately similar to roundabout 1 2.8 

132027 Hs.181444 N78844 ESTs; Weakly similar to R12C12.6 [Celeg 25 

127383 Hs.180478 AA447990 ESTs 2.8 

132598 Hs530 M81379 collagen; type IV; alpha 3 (Goodpasture 25 

101121 Hs.1313 L09753 tumor necrosis factor (figantf) supertaml 25 

123000 Hs.105640 AA479347 ESTs 25 

121329 Hs.1755 AA404324 ESTs 25 

100481 Hs.121439 HG1098-HT1098 Cystatin D 2.7 

113803 H&283683 W42789 ESTs 2.7 

110934 Hs.169001 N48708 ESTs; Weakly similar to cytochrome P-450 2.7 

432888 T86823 ESTs 2.7 

121802 Hs.188898 AA424328 ESTs 2.7 

130396 Hs.155313 AB002331 Human mRNA for KIAA0333 gene; partial cd 2.7 

121103 Hs.97697 AA398936 ESTs; Weakly similar to (deffrne not ava 2.7 

131129 Hs^3240 R27296 ESTs 2.7 

130943 Hs£72429 D50855 calcium-sensing receptor (hypocalciilric 2.7 

134676 HS57819 W28051 ESTs; Weakly similar to keratin 9; cytos 2.7 

111900 Hsi5318 R39044 ESTs 2.7 

106025 Hs.173334 AA412063 ESTs 2.7 

126144 Hs.40639 N39696 yx32a07 jl Soares melanocyte 2NbHM Homo 2.7 

103248 Hs.75262 X77383 calheps'mO 2.7 

127230 Hs£74170 H30501 Homo sapiens Opa-lnferac8ng protein OIP 2.7 

101584 Hs.84072 M35252 transmembrane 4 superfamily member 3 2.7 

124131 Hs.167489 H19980 ESTs 2.7 

129689 Hs.77873 M130156 ESTs 2.7 

132892 HSJ973 W92797 ESTs 2.7 

120827 Hs.132967 AA347717 ESTs 2-7 

134579 HS55963 N23222 ESTs; Moderately similar to !!!! ALU SUB 2.7 

106149 H&258301 AA424881 ESTs 2.7 

132037 Hs532541 AA203649 ESTs; Weakly similar to HEM45 [H^apiens 2.7 

130542 Hs.179825 U64675 Human sperm membrane protein BS-63 mRNA, 2.7 

122851 Hs59593 AA463627 ESTs 2.7 

134983 Hs.196384 028235 prostaglandin-endoperoxide synthase 2 (p 2.7 

120537 Hs.160422 AA262790 ESTs 2.7 

131036 Hs.174140 X64330 ATP citrate lyase 2.7 

133889 HS211582 AA099391 ESTs • 2.7 

128847 Hs.106529 AA424199 w81e0lJl ScaresJotaLfetusJTD2HF8_9w 2.7 

112755 HS506044 R938Q2 ESTs 2.7 

423239 AA323591 EST26392 Cerebellum II Homo sapiens cONA 2.7 

105031 Hs.12321 AA127240 ESTs 2.7 

126021 Hs.187516 AA775894 ESTs 2.7 

102116 U13706 Human ELAV-Ske neuronal protein 1 isofo 2.7 

133394 H&237225 R16759 ESTs; Weakly similar to (defline not ava 2.7 

104267 Hs278439 C00358 ESTs 2.7 

107614 Hs.40241 AA004878 ESTs; Highly similar to (deffrne not ava 2.7 

129809 Hs.1259 X55283 aslaloglyooproteln receptor 2 2.7 

112109 H&283309 R45221 ESTs; Weakly similar to Oil ALU SUBFAMI 2.7 

128422 T85681 yd60c06j1 Soares fetal liver spleen 1NF 2.7 

109494 H&43899 AA233702 ESTs 2.7 

118696 H&292284 N72086 Homo sapiens RNA polymerase III largest 2.7 

106053 Hs56727 AA416963 ESTs; Highly similar to histone H2A (H.S 2.7 

104440 H&284380 120492 gamma-glutarnyflransferasa 1 2.7 
129426 Hs.111323 AM12087 EST; Highly similar to (defline not ava 2.7 

123738 AA620411 small inducible cytokine A5(F1ANTES) 2.7 

106716 Ha238928 AA464962 ESTs 2.7 

103663 278291 Z78291 Homo sapiens brain fetus Homo sap 2.7 

114162 H&22265 Z38909 ESTs 2.7 

113063 H&5027 T32438 ESTs 2.7 

127897 AA773857 a(80o09.r1 Soares_NhHMPu_S1 Homo sapiens 2.7 

130621 Hs.16803 AA621718 ESTs; Weakly similar to (define not ava 2.7 

116245 Hs.42796 ' AA479958 ESTs; Highly similar to (daflina not ava 2.7 

125499 R11878 yt49d1 1 A Soares infant brain 1NIB Homo 2.7 

133960 Hs.77899 M19267 tropomyosin 1 (alpha) 2.7 

104470 Hs£46358 N28843 ESTs; Weakly similar to Similar to ootla 2.7 

134982 Hs£2308 N46086 ESTs 2.7 

108803 HS284295 AM79114 ESTs 2.7 

104899 Hi285574 AA054726 ESTs 2.7 

125401 HsJ37585 AI204637 ESTs; Moderately similar to WAA0350 [H. 2.7 

111253 Hs.15768 N70042 ESTs; Moderately similar to Ml ALU SUB 2.7 

118449 Hs.164478 N66413 ESTs; Weakly similar to (defile not ava 2.7 

134507 H&84318 M63488 repBcafion protein A1 (70kD) 2.7 

121609 H&98185 AA416867 EST 2.7 

113835 Hi27475 W56590 ESTs 2.7 

113962 H&285290 W86375 ESTs; Highly similar to (deffina not ava 2.7 

121913 Hs.98558 AA428062 ESTs 2.7 

108194 Hs216717 AAD57250 ESTs 2.7 

130799 Hs.12696 AA464273 ESTs 2.7 

123184 Hs.18166 AA489072 Homo sapiens mRNA for K1AA0870 protein; 2.7 

103420 Hs.173497 X97065 SEC23-lika protein B 2.7 

106186 Hs£315 AA427398 axstylserotonlnN^tlryltransferase-tike 2.7 

101349 L77559 Homo sapiens DGS-8 partial mRNA 2.7 

112954 H&£655 T16559 ESTs 2.7 

133054 Hs£91079 R07876 ESTs; Weakly similar to unknown (S.cerev 2.7 

128131 H&25640 AI283162 daudin3 2.6 

101864 Hs.75777 M95787 transgelin 2.6 

111948 H&26303 R40752 ESTs 2.6 

130145 Hs.151051 U07620 protein kinase mitogen-activated 10 (MAP 2.6 

126507 Hs23964 AI362218 ESTs 2.6 

117903 Hs.47111 N50740 ESTs 2.6 

116345 Hs.199067 AA496981 ESTs 2.6 

132227 Hs.4248 AA412620 ESTs 2.6 

125746 HS274256 H03574 yj42M6/1 Soares placenta Nb2HP Homo sa 2.6 

105073 HsJ9463 AA137034 ESTs 2.6 

102764 U82310 Homo sapiens unknown protein mRNA, parti 2.6 

131367 Hs.173933 AA456687 ESTs 2.6 

130792 Hs.19500 AA307896 nuclear localization signal deleted In v 2.6 

107427 Hs.46736 W26975 ESTs 2.6 

117477 Hs^4175 N30328 ESTs 2.6 

106290 Hs.16364 AA435542 ESTs 2.6 

126829 Hs.7910 R11547 ESTs 2.6 

118836 Hs.173001 N79820 ESTs 2.6 

100147 Hs.136348 D13666 osteoblast specific factor 2 (fasciclin 2.6 

104278 Hs.109253 C02582 ESTs; Highly similar to (defline not ava 2.6 

135051 Hs£3484 C15324 ESTs 2.6 

126081 H&227835 AI346024 collagen; type I; alpha 1 - 2.6 

123579 AA608983 af5d4.s1 Soares_testis_NHT Homo sapiens 2.6 

130115 Hs.149923 M31627 X-box binding protein 1 2.6 

101434 Hs.1430 M20218 coagulation factor XI (plasma thrombopla 2.6 

122962 Hs.104720 AA478429 ESTs; Moderately similar to Bll ALU SUB 2.6 

126151 Hs.40808 AA324743 ESTs 2.6 

128925 HSJ21851 061676 Homo sapiens mRNA; cONA DKFZp586J21 1 8 (1 2.6 

128919 Hs.103391 L27559 insulin-like growth factor binding prote 2.6 

130296 Hs.154103 R09286 UMproteln (similar to rat protein kina 2.6 

128402 Hs.191637 AA457244 ESTs 2£ 

129273 Hs.109968 W63783 ESTs 2.6 

125483 Hs.7788 F07759 ESTs 2.6 

132953 Hs.321264 AA029927 ESTs 2.6 

130963 Hs.21639 U57099 nuclear protein; marker for differentiat 2£ 

120614 Hs.194154 AA284281 ESTs; Weakly similar to llll ALU SUBFAMI 2.6 

123251 Hs.103267 AA490858 ESTs; Moderately similar to Rabin3 [Rjw 2.6 

121710 Hs.96744 AA419011 ESTs 2£ 
125428 H&B51 W74608 ESTs; Highly similar to (defline not ava 25 

115906 Hs*2302 AA436616 ESTs 2-6 

108432 AA076626 Homo sapiens done 23851 mRNA sequence 2£ 

126191 Hs.191911 H97728 ESTs 2.6 

106164 HS281434 AA425773 ESTs 2.6 

111519 HsJ268615 R08165 ESTs 2.6 

134590 Hs.173840 W58612 ESTs 2.6 

10256S U59748 Human desert hedgehog (hDHH) mRNA, parti 2.6 

129879 Hs.13109 AA194973 ESTs 2.6 

114264 Hs334609 Z40074 ESTs 2.6 

106236 H&21104 AA429951 ESTs 2^ 

135192 H&321709 AF000234 purfnerglc reciter P2X; Bgand-gated b 2.6 

109833 H&29889 H00580 ESTs 25 

105756 K&8535 AA303088 EST s; WeaHy similar to transformatiorH 2.6 

121422 HSS7S67 AA406210 ESTs 2.6 

130417 Hs.155485 U58522 Human huntingSn interacting protein (HI 2.6 

124312 Hs.102329 H94647 ESTs 2£ 

108998 H&97199 M156058 ESTs 2.6 

127081 Hs.180591 H88362 ESTs; Weakly similar to weak simllarily 2.6 

129574 Hs.11463 AA458603 ESTs; Weakly similar to (defline not ava 2.6 

112410 H&26904 R61680 ESTs 2.6 

123929 Hs.112981 AA621364 ESTs 2.6 

122905 Hs.104835 AA470070 ESTs 2.6 

116399 Hs.110637 AA599729 Homo sapiens homeobox protein A10 (HOXA1 2.6 

130279 Hs.153934 AA424044 core-binding factor runt domain; alpha 2.6 

130021 Hs.1435 M24470 guanosine monophosphate reductase 2.6 

100585 Hs.199160 HG2367-HT2463 TrifootaxHornokigHrx 2-6 

104965 HSJ0177 AA084104 ESTs 2.6 

117711 Hs.46485 N45201 EST 2.6 

124782 HiU8712 R44357 ESTs 2.6 

111299 Hs.74313 N73808 ESTs 2.6 

103616 Hs32971 Z46973 -phosphoinositide-S-klnase; class 3 2.6 

133629 Hs.195614 D13642 KIAA0017 gene product 2.6 

126484 Hs.169977 AI086782 ESTs 2.6 

100858 HG4245-HT4515 Forkhead Family Abel 2.6 

133547 Hs501927 X02883 T-cell reoeptor; alpha (V;D;J;G} 2.6 

126680 Hs.133865 FO7097 ESTs 2.6 

125739 Hs.92137 AA428557 v-myc avian myelocytomatosls viral oncog 2.6 

102276 Hs.10247 U30999 Human (mernc) mRNA, 3VTR 2£ 

105586 Hs.191538 AA279137 ESTs 2.6 

103978 Hs.34136 AA307443 ESTs 2£ 

125054 Hs.268601 T80622 ESTs; Weakly slmHarto (defline not ava 2.6 

114212 Hs21201 239338 ESTs; Highly similar to (defline not ava 2.6 

116959 Hs.40022 H79310 EST 2.6 

109228 Hs.306995 AA193366 ESTs 2.6 

133989 Hs.78202 U29175 SWI/SNF related; matrix associated; acS 2.6 

100640 Hs.182183 HG2743+IT2845 Caldesmonl, AILSp2ce3,Non-Musde 2.6 

133093 Hs285996 AA598749 ESTs 2-6 

114306 Hs.6540 Z40861 ESTs 2j6 

106060 Hs.171391 AA417287 C-terminal binding protein 2 2.5 

107748 Hs.60772 AA017258 EST 2^ 

100134 Hs.49 D13264 macrophage scavenger receptor 1 2.5 

133969 Hs.78 U13044 GA-binding protein transcription factor, - 2.5 

130992 Hs.74316 AA455001 ESTs 2.5 

127493 Hs291701 AA808081 oc39a08.s1 NCI_CGAP_GCB1 Homo sapiens cD 2.5 

132869 Hs.203981 N26855 ESTs 2.5 

117570 Hs.44583 N34415 EST 2.5 

124644 Hs.109654 N91279 ESTs 2.5 

103558 H&2785 Z19574 keratin 17 2.5 

132883 HsS897 AA047151 ESTs 2.5 

102009 Hs£2643 U02680 protein tyrosine kinase 9 25 

116058 HS20159 AA454156 ESTs 2.5 

121989 Hs.193784 AA430044 ESTs 2.5 

131257 Hs24908 AA256042 ESTs 25 

100320 Hs.75275 D50916 homolog of yeast (S. cerevisiae) ufd2 25 

102959 Hs.121524 X15722 glutamione reductase 25 

132969 Hs.6166 AA047616 ESTs 25 

130869 Hs2057 AA128100 uridine monophosphate synthetase (orotat 25 

129645 Hs.1 18131 L33928 5;10^fhenytetrahydrofolate synthetase 25 




126399 H&83883 M128075 Zt16d08/1 SoaresjiregnanLuteruaJIfaHPU 25 

134069 Hs.78935 U29607 Homo sapiens eIF-2-associated p67 homolo 25 

109816 Hs£1960 F11013 ESTs; Weakly similar to KIAA0176 [H.sapi 25 

134801 H&89695 X02160 Insulin receptor 25 

104232 Hs.10587 AB002351 Human mRNA for KIAA0353 gene; partial cd 25 

107361 Hs.159486 U72513 Human RPL13-2 pseudogene mRNA; complete 25 

106057 H&289074 AA417067 ESTs 25 

1342S2 Hs£0720 AA031782 Homo sapiens mRNA; cDNA DKFZp586B1722 (f 25 

128062 Hs.105547 AA379S00 ESTs 25 

110009 Hs.6614 H10933 ESTs 25 

111375 H&20432 N93696 ESTs 25 

122642 H&99361 AA454186 ESTs 25 

127999 H&69851 AA837495 ESTs; Weakly similar to Wiskott-Aldrich 23 

105029 Hs.13268 AA126855 ESTs 25 

105082 H&26765 AA143763 ESTs; Weakly similar to Similarity to S. 25 
TABLE 1A shows the accession numbers for those primekeys lacking UnigeneIDs for Table 
1. For each probeset we have listed the gene cluster number from which the oligonucleotides 
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs 
and mRNAs. These sequences were clustered based on sequence similarity using Clustering 
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers 
for sequences comprising each cluster are listed in the "Accession" column. 
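
As an illustration only, rows laid out as described above (a Pkey, a CAT number, then one or more Genbank accession numbers) could be read into a lookup keyed by Pkey. The following Python sketch is not part of the disclosed method; the names ClusterRow, parse_cluster_row and index_by_pkey are hypothetical, and it assumes that each row sits on a single line and that the CAT number is written as a single token without internal spaces.

```python
from typing import Dict, List, NamedTuple


class ClusterRow(NamedTuple):
    """One table row (hypothetical container, for illustration only)."""
    pkey: str               # unique Eos probeset identifier number
    cat_number: str         # gene cluster number
    accessions: List[str]   # Genbank accession numbers in the cluster


def parse_cluster_row(line: str) -> ClusterRow:
    """Split a whitespace-delimited row into Pkey, CAT number and accessions."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected Pkey, CAT number and an accession: {line!r}")
    return ClusterRow(fields[0], fields[1], fields[2:])


def index_by_pkey(lines: List[str]) -> Dict[str, ClusterRow]:
    """Build a Pkey -> row lookup, skipping blank lines between rows."""
    non_blank = (line for line in lines if line.strip())
    return {row.pkey: row for row in map(parse_cluster_row, non_blank)}


# Two rows copied from the table, separated by a blank line as in the listing.
rows = index_by_pkey([
    "123579 genbank_AA608983 AA608983",
    "",
    "100789 tigr_HT4163 S67998",
])
print(rows["123579"].cat_number)   # genbank_AA608983
print(rows["100789"].accessions)   # ['S67998']
```

Rows whose accession list continues over several lines would first have to be joined into one line before being passed to this sketch.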



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number 

Accession: Genbank accession numbers 



Pkey CAT number Accessions 

108552 111555J AAD71210 AA069899 AA071438 AA084912 AA084803 AA079371 AA079370 

126023 1596090J H57661 H58881 

126088 1606216 1 H756B1 H70975 

102565 32479 1 AB010994 1159748 AA064660 

101964 48158_-7 S81578 

125499 1562851 1 H10543R11878 

125596 1708455J R25698 R56582 R56018 

118417 37186 1 AF08Q229 AF080231 AF08Q230 AF030232 AF080233 AF030234 BE550633 AI636743 AW614951 BE457547 AB80833 

A1633818 N29986 U87592 U87593 U87590 U87591 S46404 U87587 AA463992 AW20S802 AI970376 AI583718 AI672574 
N25695 AWS65466 AI818326 AA126128 AI480345 AW013827 AA248838 AB14968 AA204735 AA207155 AA206262 
AA204833 AW003247 AW496808 AI080480 AI631703 AK51023 AI887418 AWB18140 AA502500AI206199AB71282 
AI352545 BE501030 AI652535 BE465762 AA206331 AW451866 AA471088 AA206342 AA204834 AA206100 AW021661 
' AA332922 N66048 AA703396 H92278 AW139734 H92683 U87589 U87595 H69001 U87594 BE466420 AI624817 

BE466611 AI206344 AA574397 AA34B354 AI493192 

125661 327827 1 AA491830 R50173 R55192 R50320 AI732306 AI732305 A1820727 AI820728 R55191 R50319 R50227 

125957 158354SL1 H41694 H45213 

125982 1766315J R98091 W92898 

127248 227560 1 AA364195 AA325029 AW962050 

103731 112052 1 AA070545 M131490M131373 

127261 231687 1 AA330501 AA661567 

127265 232391J AA331503 AA332751 AW982542 

126659 1541209 1 T16245 R19694 F13545 H10299 T66048 T65279H18006 

127315 37938 1 AF1 16622 All 14507 AA640834 AA377999 

103806 112618 1 AA130614AA071410 

128104 502608 1 AA906093 AA971000 

104602 524482J K47610R86920 

128152 297868J F07973R20353 AA442660 

128422 1811283J T77794T85681 

127897 446527 1 AA773681 AA773857 

106566 120358J BE298210 AI672315 AW086489 BE298417 AA455921 AA902537 BE327124 R14963 AA085210 AW274273 AI333584 

AI369742 AI039S5B AI885095 AI476470 AI287650 AI885299 AI985381 AW592624 AW340136 AE66556 AA456390 
AI310815AA484951 

129735 44573J AI950087 N70208 R97040 N38809 AI3081 19 AW967677 N35320 A1251473 H59397 AW971573 R97278 W01O59 

AW967671 AA908598 AA251875 AI820501 AI820532 W87891 T85904 U71456 TB2391 BE328571 T75102 R34725 
AA884922 BE328517 AE19788 AA884444 N92578 F13493 AA927794 AI560251 AW874058 AL134043 AW235363 
AA663345 AW008282 AA483964 AA283144 AI890387 AJ950344 AI741346 AI689062 AA282915 AW102898 AI872193 
AI763273 AW173586 AW150329 AI653832 AI762688 AA988777 AA488892 AI356394 AW103813 AI539642 AA642789 
AA856975 AW505512 AI961530 AW629970 BE612881 AW275997 AW513601 AW512843 AA044209 AW856538 
AA180009 AA337499 AW961101 AA251669 AA251874 AI819225 AW205862 AI683338 A1858509 AW276905 AI633006 
AA972584 AA908741 AW072629 AW513996 AA293273 AA969759 N75628 N22388 H84729 H60052 T92487 AI022058 
AA780419 AA551005 WB0701 AW613456 AI373032 AI564269 F00531 H83488 W37181 W78802 R56056 AW02839 
R67840 AA300207 AW959581 T63226 FO4O05 

123147 219802 -2 AA487961 

130529 158447J AA178953 AA192740 

123579 genbank_AA608983 AA608983 

109175 genbank_AA180496 AA180496 

100789 tigr_HT4163 S67998 

100858 tigr_HT4515 U10072 
123798 579959_1 AA620411 AA287491 

102116 entrez_U13706 U13706 

102398 entrez_U42359 U42359 

102764 entrez_U82310 U82310 

118475 genbank_N66845 N66845 

104776 genbank_AA026349 AA026349 

104787 genbank_AA027317 AA027317 


lloAJE 


genoani^. i »/ou/ 




113938 genbank_W81598 W81598 

122635 genbank_AA454085 AA454085 

108407 genbank_AA075519 AA075519 

108432 genbank_AA076626 AA076626 

108555 genbank_AA084963 AA084963 

101349 entrez_L77559 L77559 

124447 genbank_N48000 N48000 

119071 genbank_R31180 R31160 

103520 entrez_Y10511 Y10511 

103663 genbank_Z78291 Z78291 

128046 877605_1 AA873285 AI025762 

126959 546044_1 AA199853 AA206355 

123465 genbank_AA599033 AA599033 
TABLE 2 shows a preferred subset of the Accession numbers for genes found in Table 1 
which are differentially expressed in prostate tumor tissue compared to normal prostate 
tissue. 



Pkey: Unique Eos probeset identifier number 
ExAccn: Exemplar Accession number, Genbank accession number 
UnigeneID: Unigene number 
Unigene Title: Unigene gene title 
R1: Ratio of tumor to normal body tissue (Relaxed ratio (87/70)) 



Pkey ExAccn UnigeneID Unigene Title R1 


131919 M121266 H&27245B ESTs 372 

120328 M196979 Hs290905 ESTs; Weakly similar to (defline not ava 32.6 

101486 M24902 Hs.1852 acid phosphatase; prostate 252 

119073 R32894 HS279477 ESTs 243 

133428 M34376 Hs.183752 microseminoprotein; beta- 23.8 

128180 AA595348 Hs.171995 kallikrein 3; (prostate specific antigen 2U 

104080 AA402971 Hs57771 Homo sapiens mRNA for serine protease (T 185 

127537 AA569531 Hs.162859 ESTs 18j6 

131665 R22139 H&30343 ESTs 174 

101050 K01911 Hs.1832 neuropeptide Y 173 

130771 N48056 Hs.1915 folate hydrolase (prostate-specific memb 17 

107485 W63793 Hs262476 S-adenosylmethionine decarboxylase 1 16.7 

106155 AA425309 Hs33287 ESTs 165 

129534 R73640 Hs.11260 ESTs 164 

100569 HG2261-HT2351 Antigen, 

101889 S39329 Hs.181350 kallikrein 2; prostatic 15.4 

135389 U05237 Hs59872 fetal Alzheimer antigen 15 

133944 AA045870 Hs.7780 ESTs 125 

130974 X57985 Hs2178 H2B histone family; member Q 113 

114768 AA149007 Hs.182339 ESTs 113 

104660 AA007160 Hs.14846 ESTs 11.4 

131061 N64328 Hs268744 ESTs; Moderately similar to KIAA0273 [H. 105 

126645 AI167942 Hs51635 Homo sapiens BAC clone RG041D1 1 from 7q2 10 J 

135153 N40141 H&95420 Homo sapiens mRNA for JM27 protein; comp 10.6 

107033 AA599629 Hs.113314 ESTs 10j6 

118417 N66048 ESTs; Weakly similar to polymerase [H.sa 105 

126758 W37145 Hs293960 ESTs 102 

107102 AA609723 Hs.30652 ESTs 10.1 

116787 H28581 Hs.15641 ESTs 10.1 

115719 AA416997 Hs59622 ESTs 10 

123209 AA489711 H&203270 ESTs 95 

101664 M60752 Hs.121017 H2A histone family; member A 93 

112971 T17185 Hs.83883 ESTs 9.7 

117984 N51919 Hs.106778 ESTs 9.7 

129523 M30894 Hs274509 T-cell receptor gamma cluster 9.4 

132964 AA031360 Hs.167133 ESTs 92 

121853 AA425887 Hs.98502 ESTs 9 

119617 W47380 HS55999 ESTs 8.9 

105627 AA281245 Hs23317 ESTs 83 

101461 M22430 Hs.76422 phospholipase A2; group IIA (platelets; 8.7 

124526 N62096 Hs293185 yz61c5.s1 SoaresjnultJple.selerosisJNbH 85 

133845 T68510 Hs.76704 ESTs 82 

133354 AA055552 H&334762 ESTs; Weakly similar to KIAA0319 [Ksapl 8.1 

119018 N95796 Hs278695 ESTs 8 

100394 D84276 Hs.66052 CD38 antigen (p45) 8 

106579 AA456135 Hs23023 ESTs 7.8 

114965 AA250737 Hs.72472 ESTs 7.4 

112033 R43162 H&22627 ESTs 7.1 

102398 U42359 Human N33 protein form 1 (N33) gene, exo 7 

101201 122524 H&2256 matrix metalloproteinase 7 (matrilysin; 65 

101803 M8B546 Ha.155691 pre-B-cell leukemia transcription factor 63 

120562 AA280036 H&302267 ESTs;WeakrysimtlartoW01A6jC[C.efega 65 
109112 AA169379 H&257924 ESTs 63 

109795 F10707 Hs326416 ESTs 6.7 

130336 X07730 Hs.171985 kalEkrein 3; (prostata spedHcanfigen 6.6 

131425 AA219134 Hs.26691 ESTs 6.6 

132902 AA4909S9 Hs39838 ESTs 63 

133724 U07919 Hs.75746 aldehyde dehydrogenase 6 63 

120215 241050 Hs.108787 Homo sapiens Mcd4p homolog mRNA; comptet 63 

131881 AA010163 Hs.3383 upstream regulatory element btnaTng prot 63 

100727 X07290 Hs334786 Human HF.12 gene mRNA 63 

121770 AM21714 H&278428 Homo sapiens mRNA for KIAA0896 protein; 63 

123475 AA599267 Hs^50528 ESTs; WeaMy similar to ANKYRIN; BRAIN V 63 

133061 AB000584 H&286638 prostata tDfferenQafion (actor 63 

116429 AA609710 H&279923 ESTs; WeaWy similar to similar to GTP-b 6.2 

101233 129008 Ks378 sorbitol dehydrogenase 62 

104691 AA011176 Hs37744 ESTs 62 

127248 AA325029 EST27953 Cerebellum II Homo sapiens CDNA62 

105500 AA256485 H&222399 ESTs &1 

130828 AA053400 H&203213 ESTs 55 

115357 AA281793 Hs.72988 ESTs 53 

116334 AA491457 Hs.48948 ESTs 5.7 

120132 238839 Hs.125019 ESTs; Weakly similar to UH ALU SUBFAM1 5.6 

106375 AA443993 Hi289072 ESTs 5.6 

124777 R41933 Hs.140237 ESTs; Weakly similar to neuronal thread 5.6 

101791 M83822 H&62354 .Human bekje-Gke protein (BGL) mRNA; par 53 

117698 N41002 Hs.45107 ESTs 53 

122041 AA431407 H&98732 Homo sapiens Chromosome 16 BAG clone CIT S3 

133723 AA088851 H&262476 S-adertosytmathtonlne decarboxylase 1 53 

113938 W81598 ESTs 5.4 

133015 AA047036 H&246315 ESTs 5,4 

108186 AA056482 Hs.7780 ESTs 53 

104466 N25110 Hs326392 Human guanine nucleotide exchange (actor 53 

104033 AA365031 Hs38944 ESTs 53 

110844 N31952 Hs.167531 ESTs; Weakly similar to (define not ava 53 

129056 H70627 Hs.108336 ESTs; Weakly similar to fill ALU SUBFAMI 53 

133493 AA284143 Hs.194369 Homo sapiens chromosome 1 atrophin-1 rel 53 

129184 W26769 Hs.109201 ESTs; Highly similar to (define not ava 52 

101448 M21389 Hs.195850 kerafin 5 (epidermolysis bullosa simplex 5.1 

116188 AA464728 Hs.184598 ESTs; Weakly similar to Oil ALU SUBFAMI 5.1 

105921 AA402613 Hs.169119 ESTs 5.1 

103375 X91868 Hs34416 sine oculis homeobox (Drosophila) homolo 5.1 

128871 AA400271 Hs.106778 ESTs; Highly similar to (deffine not ava 5.1 

116238 AA479362 H&47144 ESTs 5 

102913 X07696 Hs.80342 keratin 15 6 

103011 X52541 Hs326035 eady growth response 1 6 

118981 N93839 Hs39288 ESTs; Weakly similar to !!!! ALU SUBFAMI 5 
TABLE 2A shows the accession numbers for those primekeys lacking UnigeneIDs for Table 
2. For each probeset we have listed the gene cluster number from which the oligonucleotides 
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs 
and mRNAs. These sequences were clustered based on sequence similarity using Clustering 
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers 
for sequences comprising each cluster are listed in the "Accession" column. 



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number 

Accession: Genbank accession numbers 



Pkey CAT number Accession 



118417 37186 1 AFD8Q229 AFC80231 AF080230 AF08Q232 AFC80233 AF080234 8E550633 AI638743 AW614951 BE467547AI680833 

AI633818 N29986 U87S92 U87593 U87590 UB7591 S46404 U87587 AA463992 AW2068Q2 AB70376 AI583718 AI672574 
N25695 AW665466 AI818326 AA126128 AI480345 AWD13827 AA248638 AJ214968 AA204735 AA2071S5 AA206262 
AA204833 AW003247 AW49B808 AI080480 AI631703 A1651023 AI867418 AW818140 AA502500 AI206199 AI671282 
AI352545 BE501030 AI652535 BE465762 AA208331 AW451866 AA471088 AA206342 AA204834 AA206100 AW021661 
AA332922 N66048 AA703398 H92278 AW139734 H92683 U87589 U87595 K69001 U87594 BE466420 AI624817 
BE466S1 1 AE06344 AA574397 AA348354 A1493192 

127248 227560_1 AA364195 AA325029 AW962050 

107033 235652_1 AI141899 M730176 R44544 R41778 AW300793 AW966157 AA918S01 AA599629 AK382195 AI198537 AW006520 
AW236663 AW151420 A1826987 A1810832 A1669102 AK01981 N27331 AA335566 T84622 BE085347 BE085269 

102398 entrez_U42359 U42359 

113938 genbank_W81598 W81598 
TABLE 3 shows genes, including expression sequence tags, differentially expressed in 
prostate tumor tissue compared to normal tissue as analyzed using the Affymetrix/Eos Hu02 
GeneChip array. Shown are the relative amounts of each gene expressed in prostate tumor 
samples and various normal tissue samples showing the highest expression of the gene. 
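
For orientation, a tumor-to-normal ratio of the kind reported in the R1 column can be sketched as a quotient of summary expression levels. The Python sketch below is an assumption-laden illustration, not the computation used to generate the table: it takes the arithmetic mean of each sample group, whereas the statistic actually used is not stated in this section. The function name tumor_to_normal_ratio and the intensity values are hypothetical.

```python
from statistics import mean
from typing import Sequence


def tumor_to_normal_ratio(tumor: Sequence[float], normal: Sequence[float]) -> float:
    """Return mean tumor expression divided by mean normal expression.

    Assumption: the mean is used as the summary statistic for each group.
    """
    if not tumor or not normal:
        raise ValueError("both sample groups must be non-empty")
    normal_level = mean(normal)
    if normal_level <= 0:
        raise ValueError("normal expression must be positive to form a ratio")
    return mean(tumor) / normal_level


# Hypothetical intensities for one probeset (not taken from the tables).
tumor_samples = [600.0, 400.0]
normal_samples = [100.0, 100.0]
ratio = tumor_to_normal_ratio(tumor_samples, normal_samples)
print(f"R1 = {ratio:.1f}")  # R1 = 5.0
```

A probeset would then be listed when its ratio exceeds whatever cutoff is chosen for the table; that cutoff is likewise not specified here.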



Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Ratio of tumor to normal body tissue 



Pkey ExAccn UnigeneID Unigene Title 



100131 
100235 
100570 
100819 
101063 
101247 
101416 
101447 
101485 
101514 
101626 
101663 
101758 
101768 
101817 
101888 
102031 
102052 
102221 
102233 
102302 
102348 
102457 
102473 
102669 



012485 
D29954 



Hs.11951 
Hs.13421 



HG2261-HT2352 
HG4020+tT4290 
L00354 H&80247 



phosphodiesterase Imudeofide pyrophosp 

K1AA0056 protein 

Hs.171995 

Hs£387 



102751 
102823 



103031 
103043 
103093 
103376 
103401 
103613 
103677 
103962 
104084 
104257 
104301 
104769 
104851 
1048S6 
104956 
104957 
104967 
105099 
105298 



L33801 

M17254 

M21305 

M24736 

M28214 

M57399 

M60750 

M77836 

M81118 

M88163 

M99701 

U04898 

U07559 

U24576 

U26173 

U33052 

U37519 

U48807 

U49957 

U71207 

U75272 

U80034 

U90914 

X02544 

X54667 

X55733 

X60708 

X92098 

X95240 

Z46629 

Z63806 

AA298180 

AA410529 

AF006265 

D45332 

AA025887 

AA0408B2 

AA054228 

AAD74880 

AA074919 

AA084506 

AA150776 

AA233459 



Hs.78802 
Hs.279477 

Hs.89546 

Hs.123072 

Hs.44 

HS5178 

Hs.79217 

Hs.78989 

Hs.152292 

Hs45243 

Hs.2156 

Hi505 

Hs.3844 

Hs.79334 

Hs.69171 

Hs.87539 

H&2359 

Hs.180398 

HS29279 

Hs.1867 

Hs.68583 

Hs5057 

Hs£72 

Hs.123114 

H&93379 

Hs.44926 

Hs323378 

Hs.54431 

Hs.2316 

Hs£3243 

Hs.30732 

Hs5222 

Hs.6783 

H&293943 

Hs.10290 

H&23165 

Hs£0509 

Hs.10026 

Hs.291000 

Hs.23729 

Hs.26369 



glycogen synthase kinase 3 beta 
v-ets avian erythroblastosis virus E26 o 
Human alpha satellite and sateffie 3 Ju 
selecfin E (endothelial adhesion molecul 
RAB3B; member RAS oncogene family 
pleiotraphln (heparin binding growth fac 
H2B histone family; member A 
pyrrorine-5-carboxylata reductase 1 

SW1/SNF related; matrix associated; acfi 

transcripfion elongation factor A (SM)- 

RAR-retated orphan receptor A 

ISf.1 transcription factor; UM/homeodoma 

UM domain only 4 

nuclear factor; interleuldn 3 regulated 

protein kinase Olike 2 

aldehyde dehydrogenase 8 

dual specificity phosphatase 4 

UM oorraln-contalning preferred transloc 

eyes absent (DrosophQa) homotog 2 

progastricstn (pepsinogen C) 

mitochondrial Intermediate peptidase 

carboxypeptidase D 

orosorrrucoid 1 

cystatjnS 

eukaryotic translation initiation factor 
dipeptidytpeptidase IV (CD26; adenosine 
coated vesicle membrane protein 
specie granule protein (28 kDa); cyste 
SRY (sex-determining region Y)*ox 9 (ca 
H sapiens mRNA tor axonemal dynein heavy 
ESTs 
ESTs 

estrogen receptor-binding fragment-assoc 
ESTs 

ESTs; Weakly similar to liil ALU SUBFAMI 
U5 snRNP-spedlic 40 kDa protein (hPrp8- 
ESTs 

ESTs; WeaWy similar to hypothetical pro 
ESTs; Weakly similar to ORF YJL063C [S.c 
ESTs 

Homo sapiens done 24405 mRNA sequence 
ESTs 



R1 

6J3 
5.1 

Antigen, Prostate Specific All SpBce 

Transglutaminase 105 

8S 

4.7 

4.7 

11 

9.8 

6J2 

8.4 

4.9 

5.4 

IS 

5.5 

5.7 

13.2 

8.9 

5.6 

7A 

B2 

5.9 

5.1 

5.7 

9 

10.6 

15.6 

4.8 

22.6 

4.7 

45 

6.8 " 

5.2 

7A 

52 

43 

6 

64 

6.8 

105 

63 

4.9 

5.8 

6.4 

43 

6S 

7 

5.1 






105304 AA233553 Hs.190325 ESTs 4.7 

105370 AA236476 H&22791 ESTs; Weakly similar to transmembrane pr 103 

105427 AA251330 HsX8246 ESTs 5 

105542 AA261858 Hs.266957 ESTs; Weakly similar to heat shock prate 8.6 

105628 AA281251 Hs.79828 ESTs; Weakly similar to putative zinc fi 55 

105640 AA281623 Hs.6685 ESTs; Weakly similar to WAA0742 protein 8 

105645 AA28213B Hs.11325 ESTs 14 

105691 AA287097 H&289058 transcription (actor 4 65 

105730 AA292701 Hs5364 DKFZP564I052 protein 45 

105808 AA393808 H&286131 KWATJ438 gene product 7 

105826 AA398243 Hs.194477 ESTs; Moderately similar to slmSar to N 5 

105903 AA401433 Hs£00016 ESTs; Weakly similar to Cfyhosphohostto 95 

105906 AA401633 H&22380 ESTs 115 

106065 AA417558 Hs55206 ESTs 5.1 

IS 106094 AA419461 H&23317 ESTs 105 

106157 AA4253S7 Hs54892 ESTs 63 

106184 AA426643 Hs.10762 ESTs 63 

106211 AA42B240 Hs.126083 ESTs &4 

106213 AA428258 Hs5769 Homo sapiens mRNA; cOMA DKFZp564E153 (fr 5.7 

20 106272 AA432074 Hs523099 ESTs 55 

106369 AA443828 Hs288856 ESTs 65 

106400 AA447621 HS54109 ESTs 5A 

106474 AA450212 Hs.42484 Homo sapiens mRNA; cONA DKFZp564C053 (tr 92 

106507 AA452584 H&267B19 .protein pWratasel; regulatory (Wiib 55 

25 106523 AA453441 Hs51511 ESTs 4.7 

106532 AA453628 Hs57443 ESTs 4.7 

106557 AA455087 Hs22247 ESTs 5.7 

106575 AA456039 Hs.105421 ESTs 72 

106618 AA459249 Hs5715 ESTs; Weakly similar to Similarity with 55 

30 106820 AA481037 Hs.12592 ESTs 5A 

106846 AA485223 Hs54392 ESTs 55 

106973 AA505141 Hs.11923 Human DNA sequence from clone 167A19 on 75 

107110 AA609952 Hs.12784 KIAA0293 protein 6.1 

107127 AA620504 Hs.179898 ESTs 7.1 

107159 AAS21340 Hs.10600 ESTs; Weakly similar to ORF YKR081c [S.c 52

107217 D51095 Hs55861 DKFZP586E1621 protein 15.1

107365 U78294 Hs.111256 arachidonate 15-lipoxygenase; second typ 4.7

107630 AA007218 Hs.60178 ESTs 55

107734 AA016225 Hs.7517 ESTs 45

107760 AA018042 Hs£52085 EST 75

107997 AA037388 Hs52223 Human DNA sequence from clone 141H5 on c 105

108012 AA039616 Hs.173334 ESTs 65

108520 AA084138 Hs.46786 ESTs 75

108583 AA088276 Hs58826 ESTs 5.6

108613 AA100967 Hs.69165 ESTs 6

108664 AA113349 Hs.69588 EST 65

108677 AA115629 Hs.118531 ESTs 55

108807 AA129968 Hs49376 ESTs; Weakly similar to PROTEIN PHOSPHAT 5.8

108910 AA136590 ESTs 5

108933 AA147224 Hs537232 ESTs 12.7

108948 AA149579 Hs.118258 ESTs 65

109014 AA156780 Hs.262036 ESTs 155

109124 AA171529 Hs.183887 ESTs 6.1

109142 AA176438 Hs.41295 ESTs 5.1

109277 AA196332 Hs.86043 ESTs 55

109342 AA213620 Homo sapiens mRNA; cDNA DKFZp58SM1418 (16

109562 F01811 Hs.187931 ESTs; Moderately similar to voltage-gate 105

109565 F01930 Hs.23648 ESTs 7

109648 F04600 Hs.7154 ESTs 95

109799 F10770 Hs.180378 Homo sapiens clone 669 unknown mRNA; com 64

109859 K02308 Hs.20792 ESTs 55

110181 H20276 HS51742 ESTs 165

110854 N32919 Hs.27931 ESTs 10

110924 N47938 Hs.12940 yy84a09.s1 Soares_multiple_sclerosis_2Nb 5.6

111046 N55514 Hs518584 ESTs 65

111091 N59858 Hs53032 Homo sapiens mRNA; cDNA DKFZp434N185 (fr 5i

111157 N66613 Ks59364 ESTs 5

111164 N66857 Hs.122489 ESTs; Weakly similar to !!!! ALU CLASS C S.6

111221 N68869 Hs.15119 ESTs 62
111348 N90041 HS3585 ESTs 54

111353 NS0430 Hs3616 ESTs 53

111495 R07210 Hs3683 ESTs 53

111540 R08850 Hs3786 ESTs 6

111579 R10657 Hs.167115 KIAA0830 protein 12.6

111581 R10684 HsS794 ESTs 7.1

111734 R25375 Hs.128749 ESTs 62

111861 R37460 Hs.25231 ESTs 9.4

111870 R37778 Hs.18685 ESTs; Weakly similar to hypothetical pro 6.5

111937 R40431 Hs.14848 Homo sapiens mRNA; cDNA DKFZp564D016 (fr 45

111987 R42036 Hs.6763 KIAA0942 protein 6.4

112184 R49173 Hs.330242 ESTs 53

112286 R53765 Hs.158135 KIAA0981 protein 93

112380 R59740 Hs.5740 ESTs 4.7

112452 R63841 Hs.157461 ESTs 6

112601 R79111 Hs.78225 annexin A1 5.4

112753 R93696 Hs.169882 ESTs 53

112902 T09262 Hs.129190 ESTs 5.1

112984 T23457 HsJ283014 ESTs 43

113021 T23855 Hs.129836 KIAA1028 protein 103

113083 T40530 Hs.266957 ESTs; Weakly similar to heat shock prote 5.7

113200 T57773 Hs.10263 ESTs 73

113494 T88878 Hs36538 ESTs 8.7

113849 W60439 Hs3858 ESTs; Moderately similar to cbp146 [M.mu 43

113883 W72382 Hs.11958 oxidative 3 alpha hydroxysteroid dehydro 4.7

113350 W85765 Hs30504 Homo sapiens mRNA; cDNA DKFZp434E082 (fr 6.7

113986 W87462 Hs31894 ESTs 53

113989 W87544 Hs368828 ESTs 4.7

114124 Z38595 Hs.125019 ESTs; Highly similar to KIAA0886 protein 213

114340 Z41395 Hs.143611 ESTs 9.6

114346 Z41450 Hs.130489 ESTs 53

114435 AA018216 Hs.164975 Bicaudal D (Drosophila) homolog 1 7.4

114463 AA025370 Hs.40109 KIAA0872 protein 83

114652 AA101416 Hs.107149 ESTs; Weakly similar to PTB-ASSOCIATED S 5A

114721 AA131450 Hs.103822 ESTs 43

114730 AA133527 Hs331328 ESTs; Weakly similar to The KIAA0138 gen 5.1

114833 AA234362 Hs37159 ESTs; Moderately similar to CGI-66 prote 53

114860 AA235112 Hs.42179 ESTs; Moderately similar to similar to m 63

114884 AA235811 Hs393672 ESTs 53

114895 AA236177 Hs.76591 KIAA0887 protein 4.7

114908 AA236545 Hs34973 ESTs S3

114932 AA242751 Hs.16218 KIAA0903 protein 5.7

115084 AA255566 Hs.42484 Homo sapiens mRNA; cDNA DKFZp564C053 (fr 53

115140 AA258030 Hs379938 ESTs; Weakly similar to supported by GEN 53

115468 AA287061 Hs.48499 ESTs; Highly similar to Bdeight protein 4.7

115583 AA398913 Hs.45231 LDOC1 protein 7.6

115709 AA412519 Hs38279 ESTs 4.8

115772 AA423972 Hs.131740 ESTs 5

115774 AA424029 Hs388390 ESTs; Moderately similar to dynamin; int 5.4

115776 AA424038 Hs31897 ESTs 5

115821 AA427528 Hs.130965 ESTs; Weakly similar to ZINC FINGER PROT 117

115955 AA446121 Hs.44198 Homo sapiens BAC clone RG054O04 from 7q3 10.6

116024 AA451748 Hs.83883 Human DNA sequence from clone 718J7 on c 63

116108 AA457566 Hs38777 ESTs 6

116117 AA459117 Hs31575 SEC63; endoplasmic reticulum translocon 7.3

116146 AA460701 Hs.15423 ESTs 53

116296 AA489033 Hs.62601 Homo sapiens mRNA; cDNA DKFZp586K1318 (f 5.7

116379 AA521472 Hs.71252 ESTs 53

116393 AA599463 Hs.306051 protein phosphatase 2 (formerly 2A); reg 53

116401 AA5999S3 Hs39698 ESTs 73

116416 AA609219 Hs39982 ESTs 93

116587 D59325 Hs.121429 ESTs 53

116601 D80055 Hs.45140 ESTs 43

116684 F09156 Hs.66095 ESTs 73

116722 F13654 HSRH32 Stratagene cat#937212 (1992) Hom 53

116766 H13260 Hs.95097 ESTs 53

117453 N29568 Hs.108319 thyroid hormone receptor-associated prot 63

117557 N33920 Hs.44532 diubiquitin 43

117708 N45114 Hs.126280 ESTs 63
118001 N52151 Hs.47447 ESTs 11j4

118229 N62339 Hs.166254 heat shock 90kD protein 1; alpha 62

118599 N69207 Hsio3697 ESTs 65

118845 N70358 Hs.125180 growth hormone receptor 7.1

118873 N89881 Hs.44577 ESTs 6

118985 N94303 Hs.55028 ESTs 95

119107 R42424 Hs53841 ESTs 6

119126 R45175 Hs.117183 ESTs 175

119271 T16387 Hs.65328 ESTs 6

119367 T78324 Hs.250B95 ESTs 5

119721 W59440 Hs48376 ESTs 154

119741 W70205 Hs.43670 kinesin family member 3A 10.1

119780 W72967 Hs.191381 ESTs; Weakly similar to hypothetical pro 6.3

120217 Z41078 HsRS035 ESTs 45

120268 AA173939 Hs.205442 ESTs; Weakly similar to inner centromere 85

120294 AA190888 Hs.153881 ESTs; Highly similar to NY-REN-62 antige 45

120418 AA236010 Hs.26813 Homo sapiens mRNA; cDNA DKFZp586F1323 (f 4.7

1204SB AA253400 Hs.137569 tumor protein 63 kDa with strong homolog 5.6

120524 AA261652 ESTs 45


12(1571 




HS54892 


ESTs 


85 


i one OS 






CCTe 

tolS 


O.Z 


120713 


AA292655 


H&S6557 


COlS 




190992 




H&S7S94 


cSTS 


AO A 

164 


121429 


AA406283 




CCTe 
CO IS 


0.0 


121503 


AM12049 


HS290347 


CCTe 


7.6 


121 512 


AA412105 


Hs.103736 


ESTs 


5.8 


121B16 

ICIOlw 


AA424814 


H&4S827 


ESTS 


4.6 




/WtOlOwfi 


HeOP721 

f1MXJf£ 1 


EST; Weakly similar to N-copne [Ksapie 


5.6 


122294 


AA437311 


Hs.88927 


ESTs 


5.7 


122411 


AA446859 


Hs59083 


bo 1 S 


0.5 


122791 


AA460158 


Hs.129836 


kiaaiuko pruisin 


40 A 


122792 


AA460225 


H&99519 


CCTe 

cols 


0.1 


122969 


AM78539 


Hs.104336 


CCTe 

CoIS 


A a 
4.0 


123095 


AA485724 


HS27413 


CCTe 

CoIS 


C A 
0.4 


123100 


AA485957 


H&306219 


Homo sapiens done 25032 mRNA sequence 


c 
O 


123295 


AM95981 


H&250830 


CCTe 
COIS 


4./ 


123311 


AA49S252 


Hs.105069 


CCTe 
COIS 




123583 


AA609006 


Hs.1 11240 


CCTe 
COIS 


8.1 


123619 


AA6092Q0 




CCTe 


H.I 


123645 AASQ9310 Hs.188691 ESTs 45

123709 AA609651 Hs.112742 ESTs 7

1239SS C14333 Hs.108327 damage-specific DNA binding protein 1 (1 5

124178 K45998 Hs.97101 putative G protein-coupled receptor 65

124352 N21626 Hs.102406 ESTs 102

124357 N22401 yw37g07.s1 Morton Fetal Cochlea Homo sap 105

124515 N58172 Hs.109370 ESTs 142

124911 R88992 Hs.174195 ESTs 45

125154 W38419 ESTs 4.7

125992 W01626 za36e07.r1 Soares fetal liver spleen 1NF 5.1

126802 AA947601 H3J97056 ESTs 5.1

126812 236290 Hs.173933 ESTs; Weakly similar to NUCLEAR FACTOR 1 4.6

127080 AA662913 Hs.190173 ESTs 5

127308 AA507628 Hs.334390 ESTs 4.8

127370 AI024352 Hs.70337 immunoglobulin superfamily; member 4 4.7

127386 AI457411 Hs.106728 ESTs 45

127965 AA828760 Hs.292059 ESTs 4.8

128172 AI400862 HS265130 ESTs 5

128305 AI039722 Hs279009 ESTs 55

128420 AI088155 Hs.41296 ESTs; Weakly similar to unknown [H.sapie 17

128467 AA176446 Hs.180428 ESTs; Weakly similar to hypothetical 43. 45

128610 L38608 Hs.10247 activated leucocyte cell adhesion molecu 75

128625 AA242816 Hs.102652 ESTs; Weakly similar to KIAA0437 [H.sapi 8.1

128651 AA446990 Hs.103135 ESTs 65

129088 AA215971 Hs.194431 KIAA0992 protein 52

129136 N26391 Hs250723 ESTs 5.1

129171 AA234048 Hs.7753 calumenin 55

129229 AA211941 Hs.109643 polyadenylate binding protein-interactin 55

129386 N27524 Hs260024 Cdc42 effector protein 3 52

129467 AA410311 Hs.44208 ESTs 5.1
129564 H22136 Hs.75295 guanylate cyclase 1; soluble; alpha 3 163

129699 AA4S8578 Hs.12017 KIAA0439 protein; homolog of yeast ubiqu 95

129821 F11019 Hs.12696 cortactin SH3 domain-binding protein 85

129823 X00948 Hs.105314 relaxin 2 (H2) 9.1

129847 W46767 Hs.296178 ESTs; Weakly similar to RNA POLYMERASE 1 64

129912 AA047344 Hs.107213 ESTs; Highly similar to NY-REN-6 antigen 65

129858 120591 Hs.1378 annexin A3 5.1

129977 J04076 Hs.1395 early growth response 2 (Krox-20 (Drosop 8.6

130061 U82256 Hs.172851 arginase; type II 7.4

130241 U78313 Hs.153203 MyoD family inhibitor 4.9

130466 N21679 Hs.180059 ESTs 55

130541 X05808 Hs511584 neurofilament; light polypeptide (68kD) 6.7

130619 AM77739 Hs.12532 ESTs 64

130925 N71935 Hs.169378 multiple PDZ domain protein 7.9

130938 AA013250 Hs51398 ESTs; Moderately similar to PUTATIVE GUI 6.2

130971 H20332 Hs50t444 signal sequence receptor; gamma (translo 6.4

131066 F09006 Hs52588 ESTs 5

131126 F09012 Hs.181326 myotubularin related protein 2 6.4

131310 J02960 Hs.2551 adrenergic; beta-2-; receptor; surface 75

131487 AA253220 Hs57373 Homo sapiens mRNA; cDNA DKFZp56401763 (f 5.9

131561 X59841 Hs.294101 pre-B-cell leukemia transcription factor 7.6

131562 U90551 Hs.28777 H2A histone family; member L 5.1

131579 N62922 Hs29088 ESTs 11

131629 AA442119 Hs.238809 ESTs 45

131682 AA428368 Hs.30654 ESTs 45

131699 R6B657 Hs.90421 ESTs; Moderately similar to !!!! ALU SUB 65

131795 N32724 Hs.32317 Sox-like transcriptional factor 5.6

132053 H93381 Hs.38085 ESTs; Weakly similar to putative glycine 12

132122 U65092 Hs.40403 Cbp/p300-interacting transactivator; wi 5.6

132191 AA449431 Hs.288361 KIAA0741 gene product 8

132256 AA608856 Hs.431 murine leukemia viral (bmi-1) oncogene h 55

132482 AA429478 Hs.238126 ESTs; Highly similar to CGI-49 protein [ 6.6

132533 AA021608 Hs.172510 ESTs 55

132572 AA448297 Hs.237825 signal recognition particle 72kD 62

132581 R42266 Hs52256 ESTs; Weakly similar to beta-TrCP protei 16

132700 N47109 Hs5521 ESTs 65

132701 AA279359 Hs55220 BCL2-associated athanogene 2 5.3

132725 L41B87 Hs.184167 splicing factor; arginine/serine-rich 7 75

132783 N74897 Hs.278894 DEAD/H (Asp-Glu-Ala-Asp/His) box polypep 55

132790 X75535 Hs.168670 peroxisomal farnesylated protein 8

132939 U76189 Hs.61152 exostoses (multiple)-like 2 52

133142 F03321 Hs.65874 ESTs 52

133342 U295B9 Hs.7138 cholinergic receptor; muscarinic 3 103

133434 AA278B52 Hs50212 ESTs 55

133453 M68941 Hs.73826 protein tyrosine phosphatase; non-recept 45

133520 X74331 Hs.74519 primase; polypeptide 2A (58kD) U1

133544 T33873 Hs.74624 protein tyrosine phosphatase; receptor t 4.6

133608 013315 Hs.75207 glyoxalase I 45

133626 H75939 Hs.75277 Homo sapiens mRNA; cDNA DKFZp586M141 (fr 5

133633 021262 Hs.75337 nucleolar phosphoprotein p130 65

133797 S66431 Hs.76272 retinoblastoma-binding protein 2 6

133928 N34096 Hs.7766 ubiquitin-conjugating enzyme E2E 1 (homo 5.4

134095 U47414 Hs.79069 cyclin G2 52

134249 N89827 Hs50667 RALBP1 associated Eps domain containing 6.5

134321 AA418230 Hs5172 ESTs 7

134453 X70683 Hs53484 SRY (sex determining region Y)-box 4 4.7

134542 X57025 Hs55112 insulin-like growth factor 1 (somatomedi 7.7

134570 U66615 Hs.172280 SWI/SNF related; matrix associated; acti 6.4

134592 U82613 Hs.289104 Alu-binding protein with zinc finger dom 5.4

134654 W23625 Hs5739 ESTs; Weakly similar to ORF YGR200C [S.c 5

134666 AA482319 Hs5752 putative type II membrane protein 5.4

134808 Z49099 Hs59718 spermine synthase 6.7

134951 AA431480 Hs.169358 ESTs 95

135066 X04602 Hs53913 interleukin 6 (interferon; beta 2) 5.7

135155 AA358268 Hs.166556 ESTs; Moderately similar to transcriptio 45

135411 L10333 Hs.93947 reticulon 1 55

300023 M10098 AFFX control 18S ribosomal RNA 45

300254 AW079607 Hs55610 ESTs; Weakly similar to ZnT-3 [H.sapiens 75

300273 AW013907 Hs.167531 ESTs; Moderately similar to predicted us 115
300319 AW1S7648 Hs.153506 ESTs; Weakly similar to microtubule-acti 85

300568 H86709 Hs326392 son of sevenless (Drosophila) homolog 1 53

300578 AB89417 Hs.134289 ESTs 44

300671 AE397D6 Hs.93810 ESTs 73

300675 AA039352 Hs.125034 ESTs; Weakly similar to ORF YDUMOc [S.c 4.5

300680 AW468068 Hs24817 ESTs; Weakly similar to KIAA0988 protein 52

300762 AI497778 Hs.20509 ESTs 84

300810 AI076890 Hs.146847 ESTs 53

300813 AA406411 Hs.208341 ESTs; Weakly similar to KIAA0989 protein 10.6

300823 AB63088 Hs.106823 ESTs; Weakly similar to putative zinc fi 5.6

300834 AF109300 Hs.147924 ESTs 6.7

300923 AW136372 Hs.1852 ESTs 7.6

300962 AA593373 Hs293744 ESTs 55

301015 AA947682 H820252 ESTs; Weakly similar to Chain A; Cdc42hs 7

301042 AI659131 Hs.197733 ESTs 243

301242 AW161535 HS23782 ESTs 113

301254 AKJ49624 Hs283390 EST cluster (not in UniGene) with exon h 43

301262 H29500 Hs.7130 ESTs; Moderately similar to N-copine [H. 43

301388 AA156879 Hs262036 ESTs; Weakly similar to ZINC FINGER PROT 65

301563 AI802946 Hs.44208 ESTs; Weakly similar to match to ESTs AA 5.7

301656 AW008475 Hs.151258 EST cluster (not in UniGene) with exon h 6.8

301689 Z44810 Hs301789 ESTs; Weakly similar to similar to Cele 63

301783 AL046347 Hs33937 Homo sapiens PAC clone DJ1 159004 from 7p 62

301805 AI800004 Hs.142646 ESTs; Weakly similar to MesP1 [M.musculu 85

301846 R20002 Hs.6823 ESTs; Weakly similar to intrinsic factor 4.6

301891 AF1318S5 Hs278591 Homo sapiens clone 25056 mRNA sequence 63

302005 AI869666 Hs.123119 ESTs 363

302056 AI457532 Hs30488 ESTs; Moderately similar to ROSA26AS [M. 95

302067 H05698 Hs.222399 ESTs; Weakly similar to protein-tyrosine 53

302099 AL021397 Hs.137576 ribosomal protein L34 pseudogene 1 83

302147 AB022660 Hs.151717 KIAA0437 protein 5.9

302214 AJ001454 Hs.159425 Homo sapiens mRNA for testican-3 43

302236 AI128506 Hs3557 zinc finger protein 161 43

302358 D81150 Hs322848 EST cluster (not in UniGene) with exon h 53

302410 NMJXM917 Hs218366 EST cluster (not in UniGene) with exon h 263

302486 AC003682 Hs.183512 multiple UniGene matches 82

302582 NMJM0522 Hs249195 EST cluster (not in UniGene) with exon h 64

302785 AA425562 Hs.11065 EST cluster (not in UniGene) with exon h 5

302792 AA3436S6 Hs.46821 ESTs; Weakly similar to putative [H.sapi 43

302881 AA508353 Hs.105314 relaxin 1 (H1) 733

302892 N58545 Hs42346 histone deacetylase 3 85

302970 AW118352 Hs312679 EST cluster (not in UniGene) with exon h 74

302977 AW263124 Hs.315111 EST cluster (not in UniGene) with exon h 53

303029 AF199613 EST cluster (not in UniGene) with exon h 4.6

303125 AF161352 Hs.111782 EST cluster (not in UniGene) with exon h 53

303280 AI571580 Hs.170307 ESTs 43

303306 AA215297 Hs.61441 EST cluster (not in UniGene) with exon h 6.4

303309 AL134164 Hs.145416 ESTs 63

303344 AA255977 Hs.250648 ESTs; Highly similar to ubiquitin-conjug 195

303380 AA298471 HS326567 EST cluster (not in UniGene) with exon h 6.6

303401 AA75S552 Hs309497 ESTs 63

303525 AW516519 Hs273294 ESTs 43

303526 AA348111 Hs.96900 ESTs 12.1

303540 AA355607 Hs309490 ESTs; Weakly similar to MMSET type I [H. 82

303572 AW338520 Hs242540 ESTs 8.4

303685 AW500106 Hs23643 EST cluster (not in UniGene) with exon h 4.9

303699 D30891 Hs.19525 EST cluster (not in UniGene) with exon h 15.7

303702 AW500748 Hs224961 ESTs; Weakly similar to 73 kDA subunit o 63

303718 AI741397 Hs.114658 ESTs 4.6

303722 AA521510 Hs.145010 ESTs 125

303732 AW502405 Hs.125759 ESTs; Weakly similar to tumor suppressor 43

303735 AA707750 Hs.169055 ESTs; Weakly similar to cis-Golgi matrix 54

303752 AI017286 Hs3957 EST cluster (not in UniGene) with exon h 53

303753 AW503733 Hs3414 ESTs 13

303813 AI275850 Hs.114658 EST cluster (not in UniGene) with exon h 73

304053 R00493 Hs.125565 translocase of inner mitochondrial membr 43

304218 N66373 Hs27973 ESTs; Weakly similar to ZK354.7 [C.elega 6

305200 AA668128 Hs.45207 EST singleton (not in UniGene) with exon 5.7

306716 AI024916 Hs251354 ESTs 5.7
307848 AB64186 EST singleton (not in UniGene) with exon 73

307071 M368665 Hs.31476 EST singleton (not in UniGene) with exon 84

308050 AI460004 Hs.31608 EST singleton (not in UniGene) with exon 8.1

308362 AI613518 Hs.105749 EST singleton (not in UniGene) with exon 53

308923 AI863051 Hs379815 ESTs 4.4

309116 AI927149 Hs.29797 ribosomal protein L10 4.5

309375 AW075342 Hs.9271 EST singleton (not in UniGene) with exon 7.4

309674 AW205604 Hs.266009 ESTs; Weakly similar to !!!! ALU SUBFAMI 5

310095 AI921750 Hs.144871 ESTs 5

310098 AI685841 Hs.161354 ESTs 11j6

310250 AM78629 Hs.158465 ESTs S.8

310365 AI262148 Hs.145569 ESTs 9.7

310382 AI734009 Hs.127699 EST cluster (not in UniGene) 104

310409 A161Z775 Hs.145710 ESTs 43

310431 AI420227 Hs.149358 ESTs 725

310573 AW282180 Hs.156142 ESTs 73

310598 AI338013 Hs.140546 ESTs 92

310639 AW269082 Hs.175162 ESTs 43

310787 AW262580 Hs.147674 ESTs 43

310816 AI973051 Hs324965 ESTs 73

311251 AI655662 Hs.197698 ESTs 413

311280 AI767957 Hs.198248 ESTs; Weakly similar to Y38AB.1 gene pro 43

311330 AI679524 Hs201629 ESTs; Moderately similar to !!!! ALU SUB 4.6

311515 AW136713 Hs.23862 ESTs 53

311574 AB24863 Hs.211420 ESTs 4.8

311587 AI828254 HS271019 ESTs 5.8

311598 AI682088 Hs.79375 ESTs 264

311631 AI809519 Hs.27133 ESTs 6.4

311688 AW025661 Hs.240090 ESTs 74

311783 AI682478 Hs.13528 EST 4.6

311826 AA765470 Hs.85092 ESTs 6.7

311853 AW014013 Hs.107056 ESTs 53

311901 R16890 Hs.137135 ESTs 5.6

311932 AW451654 Hs357482 ESTs 43

312153 AA759250 Hs.118625 cytochrome b-561 11

312182 AA834800 Hs326263 EST cluster (not in UniGene) 165

312242 AI380207 Hs.125276 ESTs 4.7

312296 C01367 Hs.127128 ESTs 53

312407 R46180 Hs.153485 ESTs 62

312424 AA847398 Hs391997 ESTs 4.8

312425 R49353 Hs.293892 ESTs S2

312480 R68651 Hs.144997 ESTs 95

312518 C17785 Hs.182738 ESTs 63

312521 AA033609 Hs.239884 ESTs 112

312527 AI695S22 Hs.191271 ESTs 4.7

312539 AI004377 Hs.200360 ESTs 7

312546 AI623511 Hs.118567 ESTs 5.1

312563 AA976064 Hs.180842 ESTs 85

312623 AA694607 Hs.176956 EST cluster (not in UniGene) 103

312857 AA772279 Hs.126914 ESTs 5

312890 AI813654 HS-5957 ESTs 53

312903 AA939266 Hs.27B626 ESTs 7.7

312905 H92571 Hs.234478 ESTs 6jS

312976 AA836271 Hs.125830 ESTs 4.6

312983 AI079278 Hs.269899 ESTs 5.1

312996 AA249018 Hs.154331 EST cluster (not in UniGene) 7

313035 N36417 Hs.144928 ESTs 63

313166 AI801098 Hs.151500 ESTs 43

313188 AI039702 Hs.179573 collagen; type I; alpha 2 4.8

313218 AA827805 Hs.124298 ESTs 5

313226 AE00281 Hs.123910 ESTs 53

313325 AI420611 Hs.127832 ESTs 43

313326 AI088120 Hs.122329 ESTs 7.4

313425 AA745689 Hs.186838 ESTs; Weakly similar to similar to zinc 63

313499 AI261390 Hs.146085 ESTs 5.6

313540 AI797301 Hs5740 ESTs 53

313568 AW467376 Hs.129640 ESTs 43

313569 AI273419 Hs.135146 ESTs; Weakly similar to ZK1058£ [C.eleg 4.6

313603 AW468119 Hs.287631 EST cluster (not in UniGene) 63
313815 AW235194 Hs.301W7 DKFZP434N126 protein 52

3136% AW468402 Hs.254020 ESTs 7£

313634 AA6S8292 H&S3778S ESTs U

313635 AA507227 HSJ6390 ESTs 8.1

313638 AI75307S Hs.104627 ESTs 6.7

313870 C16690 Hs23767 EST cluster (not in UniGene) 4.4

313871 W4S823 Hs.104613 ESTs 4A

313676 AA861697 Hs.120591 EST cluster (not in UniGene) 13.4

313703 AI161283 Hs.280380 ESTs; Weakly similar to KIAA052S protein 10

313712 AA768553 Hs.74170 ESTs 52

313800 AW296132 Hi55098 ESTs 5.4

313979 A1S3S895 Hs.221024 ESTs 4.3

314121 AI732100 Hs.187619 ESTs 13£

314123 AW245983 Hs.223394 ESTs 6.4

314171 AB21895 Hs.193481 ESTs 294

314188 AL138431 Hs.164243 ESTs 4.6

314219 AL036001 Ha48376 ESTs 5.7

314236 AA743396 Hs.189023 ESTs 4.9

314237 AA732359 Hs.86264 ESTs 4.4

314284 AA731431 Hs293464 EST cluster (not in UniGene) 5A

314305 AI280112 Hs.125232 ESTs 52

314343 AI754701 Hs328478 ESTs; Weakly similar to alternatively sp 62

314530 A10S2358 Hs.193728 ESTs 45

314691 AW207206 Hs.136319 ESTs 17

314695 AW502698 Hs.118152 ESTs 8.9

314785 AIS38228 Hsi32976 ESTs 9.4

314601 AA481027 Hs.109045 ESTs; Weakly similar to ORF YGR245c [S.c 8

314864 AA493811 Ha29406B ESTs 6

314907 AI672225 Hs.222888 ESTs 19.3

314916 AA548806 Hs.122244 ESTs 45

314954 AA521381 Hs.187728 ESTs 52

314981 AA524953 HsJ293334 ESTs 4.6

315021 AA533447 Hs512989 EST cluster (not in UniGene) 5.1

315051 AW282425 Hs.163484 EST \55

315052 AA876910 Hs.134427 ESTs 20

315073 AW452948 Hs.257631 ESTs S3

315084 AI821085 ESTs 82

315214 AI915927 HS34771 ESTs 5A

315220 AI420753 Hs£S731 ESTs 5.1

315278 AI985544 Hs.12450 ESTs SB

315282 AE22165 Hs.144923 ESTs 45

315368 AW291563 Hs.104696 ESTs 6

315369 AA764916 K0256531 ESTs 42

315378 AK63393 Hs.145008 ESTs 62

315379 AI378329 Hs.126629 ESTs 5.4

315402 AW293424 Hs.75354 ESTs 5.1

315442 AA977935 Hs.127274 ESTs 6.6

315443 AW003416 Hs.160604 ESTs SS

315528 B37257 Hs.184780 ESTs 8.1

315593 AW198103 Hs.158154 ESTs 9.9

315634 AA837085 rfc220585 ESTs 72

315705 AW449285 HsJ13636 ESTs 8.9

315707 AM18055 Hs.181160 ESTs 5.1

315714 AA744015 Hs.298138 EST cluster (not in UniGene) 6.1

315740 T05558 Hs.156880 EST cluster (not in UniGene) 62

315762 AI391470 Hs.158618 ESTs 52

315769 AA744B75 Hs.189413 ESTs 5

315843 AA679430 Hs.191697 ESTs 5.7

315990 AI800041 Hs.190555 ESTs 92

316012 AA764950 Hs.119898 ESTs 42

316036 AA708016 Hs.190389 ESTs 53

316055 AA6938B0 Hs.6947 EST cluster (not in UniGene) 6.7

316074 AW517542 Hs293273 ESTs 52

316100 AW203986 HsJ213003 ESTs 5.1

316169 AI127483 Hs.120451 ESTs 62

316442 AA760894 Hs.153023 ESTs 17.1

316491 AA766025 Hs.186854 EST 42

316504 AW1S5854 Hs.132458 ESTs 42

316667 AW015940 Hs.232234 ESTs 72
316854 AA83121S Hs.159068 ESTs; Weakly similar to predicted using 5.1

316905 AW138241 Hs310846 ESTs 64

317003 AWQ51S97 Hs.143707 ESTs 44

317019 AA864968 Hs.127699 ESTs 11

317194 AW44S167 Hs.126036 ESTs 13.5

317224 056760 Hs33029 ESTs 8.7

317404 AI806867 Hs.126594 ESTs 8.7

317501 AA931245 Hs.137037 ESTs 11.1

317543 AI654187 Hs.195704 ESTs 142

317651 AW2S2779 Hs.169789 ESTs 53

317758 AI733277 Hs.128321 ESTs 54

317850 N2S974 Hs.152982 EST cluster (not in UniGene) 114

317869 AW295184 Hs.129142 ESTs; Weakly similar to DEOXYRIBONUCLEAS 133

317802 AI828602 Hs.211265 ESTs 53

317916 AI565071 Hs.159983 ESTs 7.7

318239 AI085198 Hs.164226 ESTs 13.1

318268 AM17736 Hs.182490 ESTs 63

318327 AW2S4013 Hsio0942 ESTs 43

318353 R45530 Hs.1440 gamma-aminobutyric acid (GABA) A recepto 6

318428 AI949409 Hs.194591 ESTs 123

318464 AI151010 Hs.157774 ESTs 43

318524 AW291511 Hs.159066 ESTs 253

318540 T30280 Hs.274803 EST cluster (not in UniGene) 7

318591 AW206806 Hs.115325 ESTs 43

318615 AI133617 Hs.10177 ESTs 63

318646 AW175665 Hs.278695 ESTs 5.7

318667 AI493742 Hs.165210 ESTs 11

318668 W26276 Hs.136075 ESTs 53

318753 AA578265 Hs.7130 copine IV 53

313080 Z45131 Hs.23023 ESTs 163

319181 F06504 Hs.27384 EST cluster (not in UniGene) 4.8

319191 AF071538 Hs.79414 prostate epithelium-specific Ets transcr 63

319233 R21054 Hs.180532 ESTs 43

319586 078803 Hs2B3683 ESTs 83

319750 AA621606 Hs.117956 ESTs 93

319763 AA460775 Hs.6295 ESTs 143

319824 AA424266 Hs.123642 EST cluster (not in UniGene) 123

319838 AA337642 Hs.95262 nuclear factor related to kappa B bindin 5.1

319913 AA179304 Hs.271586 ESTs; Moderately similar to !!!! ALU SUB 43

319964 T80579 Hs39Q270 ESTs 53

320076 AI653733 Hs.271593 ESTs 83

320102 AW298219 Hs.115325 RAB7; member RAS oncogene family-like 1 93

320187 T99949 Hs303428 EST cluster (not in UniGene) 93

320211 AL039402 Hs.125783 DEME-6 protein 73

320324 AF071202 Hs.139336 ATP-binding cassette; sub-family C (CFTR 562

320455 R498B9 Hs.24144 EST cluster (not in UniGene) 8.3

320464 AI039817 Hs.237146 ESTs 54

320561 NM_0OS953 Hs.159330 EST cluster (not in UniGene) 7

320574 AUM9443 Hs.161283 Homo sapiens mRNA; cDNA DKFZp586N2020 (f 44

320576 AL049977 Hs.162209 Homo sapiens mRNA; cDNA DKFZp564C122 (fr 6.7

320654 AW263086 Hs.118112 ESTs 6

320796 AF038966 Hs.31218 secretory carrier membrane protein 1 133

320800 AI681006 Hs.71721 ESTs 62

320813 AW360847 Hs.16578 ESTs 93

320853 AI4737S6 Hs.135904 ESTs 8.1

320856 059945 Hs35366 EST cluster (not in UniGene) 6

320899 AA633772 Hs.116798 ESTs 92

320918 AW195012 HS293970 ESTs 5

320973 H19732 HS247917 ESTs 5.9

321099 AA018386 Hs.64341 ESTs 4.6

321190 H52462 Hs.163872 EST cluster (not in UniGene) 53

321318 AB033041 Hs.137507 EST cluster (not in UniGene) 8.4

321382 AW372449 Hs.175982 EST cluster (not in UniGene) 73

321441 AW297633 Hs.118498 ESTs 14.7

321539 H80483 Hs.46903 EST cluster (not in UniGene) 92

321609 H86021 Hs.182538 ESTs; Weakly similar to hMmTRA1b [H.sapi 4.8

321636 AI761838 Hs.193465 ESTs 53

321638 AI356352 Hs.108932 ESTs 43

321644 AI204177 Hs237396 ESTs 63
321681 AA233821 Hs.190173 EST cluster (not in UniGene) 4.6

321728 X91221 Hs.144465 EST cluster (not in UniGene) 5

321758 U29112 Hs.196151 EST cluster (not in UniGene) 02

321877 AL109784 Hs.189222 EST cluster (not in UniGene) 4.6

321899 N55158 KS29468 ESTs 4.6

321902 AA746374 Hs.145010 ESTs 82

322007 AW410646 Hs.164649 ESTs 5.1

322055 AL137646 Hs.146001 EST cluster (not in UniGene) 4.3

322092 AF085833 Hs.135624 EST cluster (not in UniGene) 43

322221 AI890619 Hs.179662 nucleosome assembly protein 1-like 1 4A

322278 AF086283 EST cluster (not in UniGene) S3

322303 W07459 Hs.157601 EST cluster (not in UniGene) 22

322437 AW393804 Hs.170253 ESTs; Weakly similar to rabaptin-4 [H.sa 4A

322493 AF143235 Hs.278819 EST cluster (not in UniGene) 72

322782 AAQ6S060 Hs.202577 EST cluster (not in UniGene) 18.4

322811 AA782292 Hs.105872 ESTs 09

322818 AW043782 Hs.293616 ESTs 10.7

322826 AI807883 Hs.180059 ESTs 5

322887 AI986306 Hs.86149 ESTs; Weakly similar to KIAA0969 protein 113

322889 AA081924 Hs.124918 ESTs 7.1

322924 AA669253 Hs.136075 ESTs 45

322982 AI351191 Hs.128430 ESTs 6.6

322994 AA422116 Hs.191461 ESTs 4.7

323040 AA336609 Hs.10862 ESTs 6.9

323041 AL118747 Hs.26691 EST cluster (not in UniGene) 83

323045 AA148950 Hs.188836 ESTs 4.6

323048 AL118923 Hs.175110 EST cluster (not in UniGene) 75

323070 AA157726 Hs.264330 ESTs 75

323071 AA157867 Hs.5722 ESTs 4.7

323097 Z44354 Hs.296261 guanine nucleotide binding protein (G pr AS

323131 AA176982 Hs.270124 EST cluster (not in UniGene) 6.1

323136 AL120351 Hs30177 EST cluster (not in UniGene) 4.3

323175 AI827137 Hs.336454 ESTs 6.2

323218 AF131846 Hs.13396 Homo sapiens clone 25028 mRNA sequence 6.3

323226 AF055019 Hs.21906 Homo sapiens clone 24670 mRNA sequence 12.6

323236 AA363148 Hs.293960 ESTs 10.9

323262 AI829770 Hs.190642 ESTs 7.6

323276 AA836452 HS323822 ESTs 73

323287 AA639932 Hs.104215 ESTs 24.7

323335 AI655499 Hs.161712 ESTs 14.1

323341 AL134875 Hs.108646 ESTs 53

323362 AL135067 Hs.117182 ESTs 6.1

323486 C05278 Hs.299221 ESTs; Moderately similar to [PYRUVATE DE 8.5

323496 AI826801 Hs300700 ESTs 45

323507 H71721 Hs.128387 ESTs 4A

323545 AI814405 Hs.224569 ESTs S3

323623 AA314280 Hs.146589 EST cluster (not in UniGene) 5

323663 AW263526 Hs.243023 ESTs 7.7

323691 AA317561 Hs.145599 EST cluster (not in UniGene) S3

323810 AA740405 Hs.108806 ESTs 62

323846 AA337621 Hs.137635 ESTs 6

323929 AA354940 Hs.145958 ESTs 10.7

323959 AI63S775 Hs.6831 ESTs 5.4

323996 AA367032 Hs.217882 ESTs S3

323997 AA844907 Hs.274454 EST cluster (not in UniGene) 4.4

324019 AW177009 EST cluster (not in UniGene) 4.6

324130 AUM6575 Hs.130198 ESTs 11

324295 AI146686 Hs.143691 ESTs 13.7

324296 AI524039 Hs.192524 ESTs 63

324307 AA627642 Hs.4994 transducer of ERBB2; 2 (TOB2) 43

324330 AA884766 EST cluster (not in UniGene) 43

324385 F28212 Hs.284247 EST cluster (not in UniGene) 4.7

324430 AA464016 Hs.184598 EST cluster (not in UniGene) 116

324452 AW014022 Hs.170953 ESTs 7.6

324547 AW501974 Hs.74170 ESTs 5.6

324603 AW016378 Hs.292934 ESTs 242

324617 AA506552 Hs.195839 ESTs 54

324618 AB46282 Hs.87159 ESTs 4.6

324620 AA448021 Hs.94109 EST cluster (not in UniGene) 5.7

324626 AI68S464 
324658 AIE94767 . 
324676 AW503943 
324691 AK17983 
324696 AA341092 
324713 AW34Q249 
324715 AI739168 
324718 AI557019 
324720 AA578904 

324752 AI27B919 

324753 AA612626 
324790 AI334367 
324801 AI819924 
324804 AI692552 
324845 AA361016 
324888 AI5S4134 
324929 AI741633 
324981 AA613792 
325108 AA401863 



326997 
327098 
328492 
329362 
329929 



330020 
330211 

330384 M23263 
330430 HG2261-HT2352 
330546 U31382 
330551 U39840 
330658 AA319514 
330700 AA037415 
330704 AA056557 
330706 AA102571 
330708 AA121140 
330712 AA167269 
330725 AA252033 
330732 AA281092 

330762 AA449677 

330763 AA450200 
330772 AA479114 
330786 DS0374 
330892 AA149579 
330949 H01458 
330977 H20826 
331017 N24619 
331099 R36671 
331128 R51361 
331151 R82331 
331195 T64447 

331320 AA262999 

331321 AA278355 
331337 AA287662 
331348 AA400596 
331359 AA416979 
331383 AM54543 
331422 F1O302 
331442 H77381 
331468 N21680 
331479 N27154 
331490 N32912 
331493 N34357 
331561 N62780 
331615 N92352 
331659 W48868 
331698 Z38907 
331811 AA4O4SO0 



ESTs 

Hs.129179 ESTs 
Hs.1 12451 ESTs 

HSJ293341 ESTs; Weakly similar to Pro-a2(Xl) [Usa 
H&257339 ESTs 
Hs.183440 ESTs 

Hs.131798 EST cluster (not in UniGene) 
Hs.116467 ESTs 
H&282437 ESTs 

HS272072 ESTs; Moderately similar to lill ALU SUB 
Hs.144871 EST duster (not In UniQene) 
Hs.159337 ESTs 
Hs.14553 ESTs 
ESTs 

Hs337533 ESTs 
Hs.136102 KLAA0853 protein 
Hs.125350 ESTs 

EST duster (not in UnlGene) 
H&22380 ESTs 

CH20Jtsgi|6552458 
CH21JisglI5867660 
CH21Jisgl6682516 
CR07_hsgl|5868455 
CaXJisgil5888837 
CH.16j2gi|6165201 
CH.16_p2gi]5091594 
OL18_p2 glj6671887 
Ca05_p2gi|6013592 
androgen receptor (dihydrotestosterone r 
Hs321110 

Hs299867 guanine nucleotide binding protein 4 

hepatocyte nuclear factor 3; alpha 
Hs30732 ESTs 
Hs.20999 ESTs 
Hs.6759 ESTs 
Hs.157078 ESTs 

Hs.177576 ESTs; Moderately arrflar to kynurenine a 
Hs52620 ESTs 

Hs.24052 ESTs; Weakly similar to Iffl ALU SUBFAMI 
Hs35254 ESTs 

Hs.15251 Human DNA sequence from clone 437M21 on 
Hs.143187 FK506-blndir>g protein 3 (25kD) 
Hs.1 1356 ESTs 
EST 

Hs31202 ESTs 
Hs.142896 ESTs 
Hs315181 ESTs 
Hs.108920 ESTs 
Hs.14846 ESTs 
Ks268714 ESTs 
H&268838 ESTs 
Hs.168439 ESTs 
Hs300141 ESTs 
Hs37929 ESTs 
Hs.1 18630 ESTs 
Hs.88143 ESTs 
Hs31897 ESTs 
Hs.43543 ESTs 

Hs.237339 ESTs; Moderately similar to till ALU SUB 
Hs.41223 ESTs 
Hs.43455 ESTs 
Hs.44076 ESTs 

H&291039 ESTs; Weakly similar to hypothetical 43. 
Hs.93817 ESTs 
Hs.48703 ESTs 
HSS472 ESTs 
Hs334305 ESTs 
Hs35949 K1AA08B8 protein 
Hs.187958 ESTs 



9 




22 




4.9 




10.6 




102 




53 




72 




344 




43 




7.9 




62 




73 




123 




63 




43 








63 




5.1 




7.1 




9.6 




43 




A3 




5.8 




43 




53 




7.6 




g 




12j6 




g 




Antigen, Prostate Spe 


cffic, At SpDce 


6 




4.9 




6 




55 




5.1 




11.7 




143 




5 




72 




AS 




183 




A3 




53 




4.6 




153 




105 




4.4 




113 




113 




43 




13 




43 




4.8 - 




6.1 




92 




9.9 




43 




4.6 




43 




73 




5.4 




63 




123 




4.6 




92 




43 




8.7 




103 




43 





133 
331848 AM17039 Hs.98268 signal recognition particle 72kD 73

331873 AA429445 Hs£8M0 ESTs 63 

331889 AA431407 Hs38802 Homo sapiens Chromosome 16 BAG dona CfT 333 

331S67 AA460158 H&89589 K1AA1028 protein 63 

5 331874 AA4S4518 Hs.105322 ESTs 53 

332043 AA490831 H&201591 ESTs 103 

332076 AAS99477 H&291156 ESTs 4.4 

332173 RB281 Hs.100725 ESTs 53 

332247 N58172 ESTs 142 

10 332249 N620S8 Hs.194140 ESTs 72 

332325 179428 Hs339687 ESTs 53 

332396 AA340504 ESTs; WeaJdystaBar to simDarto human 212 

332434 N75542 Hs237731 transcription {actor 4 153 

332493 N 95495 Hs36729 ESTs; Highly simEar to GTP-bhding prat 7.1 

IS 332522 138503 Hs.178357 glutathione S-transferase theta 2 6.6 

332526 AA281753 Hs.17731 inositol 1 ^^triphosphate receptor; ty 5.8 

332530 M31682 Hs.19280 tnhtbtn; beta B (acSvtn AB beta polypep 53 

332533 M99487 Hs.Sfi825 tolatohydrolasa^rostata-spedficmemb 38.1 

332538 N48715 Hs2Q991 ESTs 63 

20 332546 D84454 Hs22587 solute carrier family 35 (UDP-gaiactosa 43 

332594 AA279313 H&32951 msthylCpQ binding protein 2 53 

332610 AA412405 Hs.40513 ESTs; WeaktystmBar to BETA GALACTOSIDA 5.6 

332661 K95742 Hsj6390 ESTs 63 

332697 T84885 Hs.75725 carboxypeptMaseE 243 

25 332712 D26070 Hs.79306 Inositol 1^-^-triphospnata receptor; ty 93 

332716 L00058 HsJ8630 v-myc avian myebcytomatosls viral oncog 5.6 

332726 R72029 HsX3428 synaptophysMke protein 5 

332781 AA233258 ESTs; Weakly similar to D10073 [Cetega 43 

332797 CH22_FGENES.6 Jt 303 

30 332798 CH22_FGENES3J5 663 

332799 CH22_FGENES3_6 193 

332933 CH22_R5ENES38_7 5.6 

332980 CH22J=GENES34_1 53 

332984 CH22_FGENES34_6 4.9 

35 333168 CH22_FGENES34_1 4.7 

333169 CH22.FGENES.94_2 4.4 

333452 CH22_FGENES.157_1 43 

333456 CH22_FGENES.157_5 43 

333458 CH22_FGENES.157_7 4.6 

40 333611 CH22_FGENES217_6 4.7 

333621 CH22J=GENES219_5 55 

333814 CH22_FGENES282_2 7.1 

333849 CH22_FGENES290JB 62 

333949 CH22_FGENES303_5 43 

45 333951 CH22_FGENES303_7 43 

333955 CH22J=GENES303_11 5.8 

334150 CH22_raENES339_1 5.1 

334223 CH22JFGENES360_4 203 

334297 CH22_FGENES372_3 9.4 

50 334443 CH22_FGENES387_2 4.6 

334444 CH22_FGENES387_4 5.6 

334447 CH22_FGENES3B7_7 111 

334570 CH22_FGENES.405_11 5.4 - 

334749 CH22_FGENES.427_1 S3 

55 334777 CH22_FGENES.430_9 4.7 

334960 CH22_FGENES.465_29 52 

335179 CH22_FGENES304J9 83 

335293 CH22_FGENES327_6 4.7 

335550 CH22_FGENES576_11 5.1 

60 335581 CH22_FGENES381_19 5.7 

335586 CH22_FGENES581_25 43 

335809 CH22_FGBES.617_6 62 

335810 CH22_FGENES.617_7 5.8 
335822 CH22_FGENES319_7 7.1 

65 335824 CH22J=GENES.619_11 83 

335853 CH22_R3ENES.626_5 43 

335886 CH22 FGENES332J 43 

336034 CH22J=GENES.678_5 63 

336441 CH22_FGENES327_7 73 
336624


CH22_FGENES£3 


433 


336625 


. CH22_FGENES.64 




336679 


CH22_FGENE&43-7 


S3 


337577 


CH22_C65E1.GENSCAN.8-1 


AS 


338255 


CH22_EM^C005500.GENSCAN276-3 


114 


338260 


CH22_EMJUX»5500£ENSCANJ279-10 


43 


338581 


CH22_EM^C005500.GENSCAR421-5 


43 


338562 


CH22JM^CC05500.GENS(m421-6 


4.3 


338759 


CH22_EMAC0O55OO.GENSCAIi517-6 


5.1 


338763 


CH22_EM^C0O550O.GENSCAH517-16 


53 


338764 


CH22_EM:AC005500.GENSCAN.S17-17 


7.1 
TABLE 3A shows the accession numbers for those primekeys lacking UniGene IDs for Table
3. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.
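
As an illustration only, the mapping described above (probeset to gene cluster to member accessions) could be held in a simple lookup table. The sketch below assumes each row has already been cleaned into whitespace-delimited fields of the form "Pkey CAT_number Accession ..."; the names ClusterEntry, parse_cluster_row and build_cluster_index are hypothetical and are not taken from the original analysis.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class ClusterEntry(NamedTuple):
    cat_number: str        # gene cluster ("CAT") number
    accessions: List[str]  # Genbank accessions in the cluster (may be empty)


def parse_cluster_row(row: str) -> Tuple[int, ClusterEntry]:
    """Split one cleaned table row into (Pkey, ClusterEntry).

    Assumes whitespace-delimited fields: Pkey, CAT number, then zero or
    more accession numbers.
    """
    fields = row.split()
    if len(fields) < 2:
        raise ValueError(f"expected at least Pkey and CAT number, got: {row!r}")
    return int(fields[0]), ClusterEntry(fields[1], fields[2:])


def build_cluster_index(rows: Iterable[str]) -> Dict[int, ClusterEntry]:
    """Build a Pkey -> ClusterEntry lookup from cleaned rows."""
    index: Dict[int, ClusterEntry] = {}
    for row in rows:
        pkey, entry = parse_cluster_row(row)
        index[pkey] = entry
    return index
```

A lookup keyed on Pkey mirrors how the table is read: start from a probeset in Table 3 and retrieve the accessions of the cluster from which it was designed.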

Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers

Pkey CAT number Accession

123619 371681_1 AA602984 AA609200
116722 143512_1 Z24878 AA494098 F13654 AA494040 AA143127
103677 41847_1 ZB3806 AJ132091 AJ132090
125992 1589048_1 H48372 W01626
109342 genbank_AA213620 AA213620
125154 genbank_W38419 W38419
101447 entrez_M21305 M21305
124357 genbank_N22401 N22401
108910 genbank_AA136590 AA136590
322278 47271_1 W69304 AF086283 W69200
315084 350959_1 AI821085 AW973464 AA554802 AI821831 AA657438 AA640756 AA650339
324019 262792_1 AW177009 AI381610
324330 300543_1 AA884766 AW974271 AA592975 AA447312
324626 336411_1 AI685464 AW971336 AA513587 AA525142
303029 37699_1 AF199613 AF108756
324804 398093_1 AI692552 AI393343 AB00510 AI377711 F24263 AA661876
324961 376239_1 AA613792 AW182329 T05304 AW8S8385

329362 cjUb 
35 338624 CH22_4071FG_6_3_ 

336625 CH22_4072FG_6_4_ 

336679 CH22_4157FG_43_7_ 

338255 CH22_6856FG_JJNK_EM:AC00 

338260 CH22_6863R3_UNK_EMACO0 
40 329929 Cl6_p2 

329960 c16_p2 

338561 CH22_7294FG_JUNK_EMAC00 

338562 CH22_7295FG_UNK_EMAC00 
338759 CH22_7581FG_JJNK_EMACO0 

45 338763 CH22_7585FG_UNKJEMAC00 
338764 CH22_7586FG_UNK_EMAC00 

333168 CH22 400FQ_94_1JJNK_EMA 

333169 CH22_401 FG_94 JJJNK_EM A 
333452 CH22_702FG_157_1_LWK_EM: 

50 333456 CH22_706FG_157_5_LINK_EM: 
' 333458 CH22_708FG_157_7_UMK_EM: 

333611 CH22_872FG_217_6JJNK_EM: 

333621 CH22_882FG_219_5JUNieEM: 

333814 CH22_1083FG_282_2JJNK_EM 
55 ' 333849 CH22_1118FG_290_8J.INK_EM 

335179 CH22_2515FG_504_9JJNK_EM 

333949 CH22 1225FG_303J5JUNK_EM 

333951 CH22_1227FG_303_7J.INK_EM 

333955 CH22 1231FG_303_11_UNK_E 
60 335293 CH22J635FG_527_6JJNK_EM 

326816 c20Jis 

326997 C21_hs 

335550 CH22_2905FGL576_11JJNK_E 
335581 CH22_2938FG_581_19JJNrLE 
65 335586 CH22J944FG_581J5JJNIC£ 
328492 c_7jis

335809 CH22_3181FQ_617_6JLINK_EM 

335810 CH22_3182FQ_617_7JJNK_EM 
335822 CH22_3195FQ_619_7_UNK_EM 

5 335824 CH22_3197FQ_619J1JMCE 
335853 CH22_3228FQ_626J5JJNK_EM 
335888 CH22_3261FQ_632_4JJNK_EM 
330020 Cl6_p2 
330211 C_6_p2 
10 337577 OC2_5864FCL_UNK_C65E1.Q 
307848 AB64186 

332797 CH22_13FGUB_2JJNK_C4G1.G 

332798 CH22_14FQ_6_5JJNK_C4G1.G 

332799 CH22_15FQ_6_6JJNK_C4Q1.Q 
15 334150 CH22_1429FQ_339_1JJNK_EM 

332933 CH22_154FG_38_7_UNK_C20H 
332980 CH22_204FGJ54_1_UNK^MA 
332984 CH22_208FG_54_6_UNK_EMA 
334223 CH22_1507FG_360_4JJNK_EM 
20 334297 CH22_158BFG_372_3JJNICEM 
327098 C21JIS 

334443 CH22_1742FG_387^JJNK_EM 

334444 CH22_1743FG_387_4_UNK_EM 
334447 CH22_1746FG_387_7_UNK_EM 

25 334570 CH22_1875FQ_405_1 1_UNK_E 
334749 CH22_2081FG_427_1JJNK_EM 
334777 CH22_2089FG_430_9JJNK.EM 
336034 CH22_3419FQ_678_5_UNK_W 
334960 CH22J281FG_465J29JJNK_E 
30 336441 CH22_3861 FQ_827_7_UNK_OJ 

330551 9851_g U39840 NMJ04496 AW1 35607 BE08745B BE087567 M177116 AW195705 AW750756 A1811008 AI634151 

BE348594 AW971075 A1347950 AK01455 AI073898 AA652680 AA613S71 AB18364 AA507550 AA693S92 
AI032599 AA991871 AI269801 AW948974 T74639 AA532907 AW949173 
330786 53973_3 BE379594 AI192455 AL033862 AI744012 AI761735AW243181 AI743687A1928223 A1423022 AI627855 

35 ~ AI636059 A1651571 AW802044 AI826995 A1431733 AI539125 AA863056 AW270910 AI768930 AW008835 

AW815183 AW591147 A1695294 AI672106 AA506358 AB08060 AA011556 AA962437 AI935483 BE219625 
AI004356 AW151394 AI21846S N66178 AI419784 AW242519 AW946907 D60374 AA989263 A1698799 
AA470460AI824167 

332247 372969J AA669097 AA513815 AA026798 AA676526 AA704429 AA704269 AW118292 AA579216 N58172 

40 332398 20265 1 AW579842 BE156562 BE156690 BE156489 BE081033 AK001559 BE149402 M85387 AW367811 AW387798 

R17370 A1908347 AA382932 R58449 HI 8732 AA371231 AW982899 AA713530 AW892946 R53463 H1 1063 
AW068542 Z40761 BE176212 BE176155 W23952 VU92188 AW374883 AA303497 AW954769 AA036808 
BE168063 AW382073 AW382085 ALD41475 H80748 AI078161 BE483983 AI805213 AI761264 W94885 
N94502 AI623772 AI419532 AI810302 AI634190 AW002516 AW150777 AB52312 AB67474 AW204807 
45 AI675502 AB37028 AW134715 BE328451 AI123157 A1560020 AB00745 M608631 AE48873 AA742484 

AW051635 H18646 AE45045 AA5071 1 1 A1640510 A1925594 AA115747 AA143035 AA151106 
332781 32044J AK001764 BE313896 AA380199 AA380151 AA19499S AW1 18089 AA495871 AW975219 AW085598 

AB78909 AW992310 AW992409 AK11857 AAS57643 AB04471 AI242589 AI623968 R03556 AI129100 
A1206500 AA680094 AA877784 AI023178 AE77519 AA424742 AE40654 AA232846 A1804273 AI382376 
50 AA001729 W90790 BE090656 AW295015 AI874598 A1431734 AM20517 AW769185 AH28355 AI192474 

AI820001 AA001929 AA706925 AI076676 AI4991 19 AI200493 AI695919 AB762T7 W69195 W69261 
AW305099 W90320 BE048357 AI658856 AA838534 AA233258 AI753393 AA709227 AI674387 AJ87261 6 
TABLE 3B shows the genomic positioning for those primekeys lacking UniGene IDs and
accession numbers in Table 3. For each predicted exon, we have listed the genomic sequence
source used for prediction. Nucleotide locations of each predicted exon are also listed.
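
To make the row layout concrete, the following sketch parses one row of this table into its four fields and derives the predicted exon length. It assumes a cleaned, whitespace-delimited row of the form "Pkey Ref Strand first-last" and assumes both listed positions are inclusive; the names PredictedExon and parse_exon_row are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictedExon:
    pkey: int    # Eos probeset identifier
    ref: str     # sequence source (GI number or publication)
    strand: str  # "Plus" or "Minus"
    first: int   # first listed nucleotide position
    last: int    # second listed nucleotide position

    @property
    def length(self) -> int:
        # Assumption: both listed positions are inclusive.
        return abs(self.last - self.first) + 1


def parse_exon_row(row: str) -> PredictedExon:
    """Parse 'Pkey Ref Strand first-last'; Ref may itself contain spaces."""
    fields = row.split()
    if len(fields) < 4:
        raise ValueError(f"expected at least 4 fields, got: {row!r}")
    strand = fields[-2]
    if strand not in ("Plus", "Minus"):
        raise ValueError(f"unknown strand: {strand!r}")
    first, last = (int(part) for part in fields[-1].split("-"))
    ref = " ".join(fields[1:-2])
    return PredictedExon(int(fields[0]), ref, strand, first, last)
```

In the rows of this table, Minus-strand entries list the larger coordinate first, which is why the length is taken from the absolute difference.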



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the
publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.



Pkey Ref

333611 Dunham, I. et al.
333621 Dunham, I. et al.
333814 Dunham, I. et al.
333849 Dunham, I. et al.
333949 Dunham, I. et al.
333951 Dunham, I. et al.
333955 Dunham, I. et al.
334150 Dunham, I. et al.
334297 Dunham, I. et al.
334443 Dunham, I. et al.
334444 Dunham, I. et al.
334447 Dunham, I. et al.
334570 Dunham, I. et al.
334777 Dunham, I. et al.
335179 Dunham, I. et al.
335581 Dunham, I. et al.
335586 Dunham, I. et al.
335809 Dunham, I. et al.
335810 Dunham, I. et al.
335822 Dunham, I. et al.
335824 Dunham, I. et al.
335886 Dunham, I. et al.
336034 Dunham, I. et al.
338441 Dunham, I. et al.
337577 Dunham, I. et al.
338260 Dunham, I. et al.
332797 Dunham, I. et al.
332798 Dunham, I. et al.
332799 Dunham, I. et al.
332933 Dunham, I. et al.
332980 Dunham, I. et al.
332984 Dunham, I. et al.
333168 Dunham, I. et al.
333169 Dunham, I. et al.
333452 Dunham, I. et al.
333456 Dunham, I. et al.
333458 Dunham, I. et al.
334223 Dunham, I. et al.
334749 Dunham, I. et al.
334960 Dunham, I. et al.
335293 Dunham, I. et al.
335550 Dunham, I. et al.
335853 Dunham, I. et al.
336624 Dunham, I. et al.
336625 Dunham, I. et al.
336679 Dunham, I. et al.
338255 Dunham, I. et al.
338561 Dunham, I. et al.
338562 Dunham, I. et al.
338759 Dunham, I. et al.
338763 Dunham, I. et al.
338764 Dunham, I. et al.



Strand Nt_position

Plus 65483684548507 

Plus 85974144597560 

Plus 7894165-7894252 

Plus 80183234018472 

Plus 8589834-8589791

Plus 8592501-8592637 

Plus 8597414-8597560

Plus 10529221-10529854 

Plus 13420934-13421058 

Plus 14298981-14299056 

Plus 14306433-14306492 

Plus 14308764-14308824 

Plus 14994868-14994943 

Plus 16259586-16260166 

Plus 21634405-21634526 

Plus 24976198-24976334 

Plus 24990333-24990497

Plus 26310772-26310909 

Plus 26314767-28314849 

Plus 26364087-26364196 

Plus 26376860-26376942 

Plus 26934235-26934364 

Plus 29014404-29014590 

Plus 3418760644187663 

Plus 595377495678 

Plus 15458919-15459257 

Minus 216964-216798 

Minus 232147-231974 

Minus 232421-232307 

Minus 2035790-2035681 

Minus 51361654136019 

Minus 26326064632457 

Minus 37298964729788 

Minus 37308644730767

Minus 51381654136019 

Minus 2631933-2631797 

Minus 51439424143806 

Minus 12734365-12734269 

Minus 16090686-16090106 

Minus 20160968-20160795 

Minus 22316408-22316275 

Minus 24668714-24668658 

Minus 26814629-26614506

Minus 227714-227577 

Minus 229124-229024

Minus 2035790-2035681 

Minus 15242294-15242231 

Minus 22311966-22311856 

Minus 22312594-22312465 

Minus 26582475-26582199 

Minus 2662814846628009 

Minus 26641232-26641101 
329960 5091594

329929 6165201 

330020 6671887 

326816 6552458 

326997 5867660 

327098 6682516 

330211 6013592 

328492 5868455 

329362 5868837 



Minus 1031-1162 

Minus 156410-156553 

Plus 172397-172491 

Plus 198354-198436 

Minus 71389-72147 

Minus 1061684-1062361

Plus 59158-59215 

Minus 4609446241 

Minus 65688-68173 
TABLE 4 shows a preferred subset of the Accession numbers for genes found in Table 3
which are differentially expressed in prostate tumor tissue compared to normal prostate
tissue.

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: Ratio of tumor to normal body tissue

Pkey ExAccn UnigeneID Unigene Title R1


100819 


HG40204fT4290Hs2387 Transgkitaminase 


105 


102698 




Hs.1887 progastricsin (pepsinogen C) 


103 


102869 


AUKm 


H&572 orosomucoidl 


2ZS 


105370 


AA238476 


Hs22791 ESTs; Weakty similar to transmembrane pr 


103 


105645 


AA9A913A 


Hs.11325 ESTs 


14 


106094 




HS23317 ESTs 


105 


109014 


AA1SS790 


Hs262038 ESTs 


153 


109562 


rvlOI 1 


Hs.187931 ESTs; Moderately similar to voltage-gate 


10.8 


113021 


IrOOOJ 


Hs.129836 WAA102B protein 


103 


114124 


238595 


Hs.125019 ESTs; Highly similar to KIAA0886 protein 


213 


122791 


/wrauioo 


Hs.129836 WAA1028 protein 


124 


124352 


VUL tWO 


Hs.102406 ESTs 


102 


301042 


AI659131 


Hs.197733 ESTs 


243 


302005 


AlftfiQfififi 

/UODSDDO 


Hs.123119 ESTs 


363 


302410 


mm 004917 


Hs218366 EST cluster (not In UnlQene) with exonh 


263 


302881 


AA508353 


Hs.105314 relaxin1(H1) 


783 


303344 


AA255977 


Hs250646 ESTs; Highly similar to ubiquittn-conjug 


193 


303753 


AW503733 


HS5414 ESTs 


13 


310431 


AI420227 


Hs.149358 ESTs 


723 


311251 


AB55662 


Hs.197698 ESTs 


413 


311596 


AI682088 


Hs.79375 ESTs 


264 


312153 


AA759250 


Hs.118625 cytochrome b-561 


11 


312521 


AA033609 


Hs239884 ESTs 


113 


313676 


AA861697 


Hs.120591 EST cluster (not In UniGene) 


13.4 


314171 


AI821395 


Hs.193481 ESTs 


294 


314907 


A1672225 


Hs222886 ESTs 


193 


315051 


AW292425 


Hs.163484 EST 


153 


315052 


AA876910 


Hs.134427 ESTs 


20 


317548 


AI654187 


Hs.195704 ESTs 


142 


317869 


AW285184 


Hs.129142 ESTs; WeaHy similar to DEOXYH1BONUCLEAS 133 


318428 


AI949409 


Hs.194591 ESTs 


123 


318524 


AW291511 


Hs.159066 ESTs 


253 


319080 


Z45131 


Hs23023 ESTs 


163 


319763 


AA460775 


Hs.6295 ESTs 


143 


320324 


AF071202 


Hs.139338 ATP-btndlng cassette; sii>tamHy C (CFTR 


563 


321441 


AW297633 


Hs.1 18498 ESTs 


14.7 


322303 


W07459 


Hs.157601 EST cluster (not in UniGene) 


22 


322782 


AA056060 


H&202577 EST cluster (not in UniGene) 


184 


322818 


AW043782 


Hs293616 ESTs 


10.7 


323287 


AA639902 


H&104215 ESTs 


24.7 


324603 


AW016378 


Hs292934 ESTs 


242 


324617 


AA508552 


HS.19S839 ESTs 


54 


324658 


AJ694767 


Hs.129179 ESTs 


22 


324691 


AI217963 


H&293341 ESTs; Weakty similar to Prc-a2(XI) [H.sa 


103 


324696 


AA641092 


Hs257339 ESTs 


102 


324718 


AI557019 


Hs.1 16467 ESTs 


344 


330211 




CH35_p2gi|6013592 


123 


330430 


KG2261-HT2352 HsJ321 110 Antigen, Prostate Specific AIL Splice 


133 


330706 


AA121140 


Hs.177576 ESTs; Moderately similar to kynurantne a 


143 


330762 


AA449677 


Hs.15251 Human DNA sequence from done 437M21 on 18.5 


330892 


M149579 


Hs.91202' ESTs 


153 


330949 


K01458 


Hs-142896 ESTs 


103 
331099 R36671 Hs.14846 ESTs 11 £

331151 R82331 K&268833 ESTs 13 

331889 AA4314C7 Hs.98802 H(miosaplensChnHnosome16BACdoneCIT 33.6 

332247 N58172 ESTs 142 

332398 AA340504 ESTs; Weakly similar to sMarto human 21.2 

332533 M994B7 H&325825 folate hydrolase (prostate-spaclllc mania 38.1 

332697 T94885 Hs.75725 carboxypepHdasa E 24 2 

332797 CH22_FGENESj6J 30.8 

332798 CH22J=GENESj6_5 66.8 

332799 CH22_FGENES.6_6 19.8 
334223 CH22J=GENES.360_4 20.3 

336624 . CH22JFGENES.6-3 435 

336625 CH22.R3ENES.64 375 
TABLE 4A shows the accession numbers for those primekeys lacking UniGene IDs for Table
4. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

Pkey CAT number



336624 CH22_4071FQ_6_3_ 

336625 CH22_4072FG_6_4_ 
330211 c 5_p2 

332797 CH22_13FG_6J JJNK_C4G1.G 

332798 CH22_14FG_6_5_UNK_C4G1 .G 

332799 CH22_15FG_6_6 UNK.C4G1.G 
334223 CH22_1507FG_360_4_UNK_EM 
332247 372869J 

3323S6 20265.1 



Accession 



AA669097 AA513815 AA026798 AA676526 M704429 AA704269 AW118292 AA579216 N58172 
AW579842 BE156562 BE156690 BE156489 BE081033 AK001559 BE149402 M85387 AW36781 1 
AW367798 R17370 AI908947 AA382932 R58449 H18732 AA371231 AW962899 AA713530 AW892946 
R53463 H11063 AW068542 Z40761 BE176212 BE176155 W23952 W92188 AW374383 AA303497 
AW954769 AA036808 BE168063 AW382073 AW382085 AL041475 H80748 AI078161 BE463983 
A1805213 AI761264 W948B5 N94502 AIB23772 AI419532 AI810302 AI634190 AW002516 AW150777 
AI352312 AB67474 AW204807 AB75502 AI337026 AW134715 BE328451 AI123157 AI560O20 
AI300745 AI608631 AI248873 AA742484 AW051635 H18646 AE45045 AAS071 1 1 AB40510 AI925594 
AA1 15747 M143035 AA151106 
TABLE 4B shows the genomic positioning for those primekeys lacking UniGene IDs and
accession numbers in Table 4. For each predicted exon, we have listed the genomic sequence
source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The
DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.

Pkey Ref Strand Nt_position


332797 Dunham, I. et al. Minus 216964-216798
332798 Dunham, I. et al. Minus 232147-231974
332799 Dunham, I. et al. Minus 232421-232307
334223 Dunham, I. et al. Minus 12734365-12734269
336624 Dunham, I. et al. Minus 227714-227577
336625 Dunham, I. et al. Minus 229124-229024
330211 6013592 Plus 59158-59215
TABLE 5: 1170 GENES UP-REGULATED IN PROSTATE CANCER COMPARED TO
NORMAL ADULT TISSUES

Table 5 shows 1170 genes up-regulated in prostate cancer compared to normal adult tissues.
These were selected from 59680 probesets on the Affymetrix/Eos Hu03 GeneChip array such
that the ratio of "average" prostate cancer to "average" normal adult tissues was greater than
or equal to 3.44. The "average" prostate cancer level was set to the 85th percentile amongst 73
prostate cancers. The "average" normal adult tissue level was set to the 85th percentile
amongst 162 non-malignant tissues. In order to remove gene-specific background levels of
non-specific hybridization, the 7.5th percentile value amongst the 162 non-malignant tissues
was subtracted from both the numerator and the denominator before the ratio was evaluated.
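
The selection ratio described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the text does not say how percentiles were interpolated, so linear interpolation between ranked values is assumed here, and the function names percentile and tumor_to_normal_ratio are hypothetical.

```python
from typing import Sequence


def percentile(values: Sequence[float], pct: float) -> float:
    """Percentile by linear interpolation between ranked values.

    The interpolation convention is an assumption; the source text does
    not specify one.
    """
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def tumor_to_normal_ratio(
    tumor: Sequence[float],
    normal: Sequence[float],
    avg_pct: float = 85.0,
    background_pct: float = 7.5,
) -> float:
    """Background-corrected ratio of 'average' tumor to 'average' normal.

    The background is the low percentile of the normal-tissue values and
    is subtracted from both numerator and denominator.
    """
    background = percentile(normal, background_pct)
    numerator = percentile(tumor, avg_pct) - background
    denominator = percentile(normal, avg_pct) - background
    if denominator <= 0:
        raise ValueError("background-corrected normal level is not positive")
    return numerator / denominator
```

Under this sketch, a probeset would be retained when tumor_to_normal_ratio(tumor, normal) is greater than or equal to 3.44, with tumor holding the 73 prostate cancer values and normal the 162 non-malignant tissue values for that probeset.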



Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: Ratio of tumor to normal tissue

Pkey ExAccn UnigeneID Unigene Title R1


446057 


AM20227 


Hs.149358 


ESTs, Weakly similar to A46010 X-Gnked 


86.42 


400302 


N46056 


Hs.1915 


folate hydrolase (prostate-specific memb 


65.46 


414569 


AF109298 


Hs.1 18258 


prostate cancer associated protein 1 


58.36 


417407 


AA923278 


Hs290905 


ESTs, Weakly similar to protease ptsapi 


56.16 


431579 


AW971082 


H&222886 


ESTs, Weakly similar to TRHY_HUMAN TRICH 


53.38 


409361 


NM_005982 Hs£4416 


sine ocuDs homsobox (Droscphila) homolo 


AO OO 


409731 


AA125985 


H&56145 


thymosin, beta, Identified in neuroblast 


jtC OA 


400298 


AA032279 


Hs.61635 


six transmembrane epithelial antigen of 


43.48 


420154 


AI093155 


Hs35420 


JM27 protein 


41.12 


433466 


AA508353 


Hs.105314 


relaxin 1 (H1) 


39.88 


400296 


AA305627 


Hs.139336 


ATP-bfnding cassette, sub-family C (CFTR 


33.42 


400292 


AA250737 


Hs.72472 


ESTs 


38.00 


432887 


AI926047 


Hs.162859 


ESTs 


36.48 


439176 


AI446444 


Hs.190394 


ESTs, WeaHy similar to 828096 line-1 pr 


38.45 


430722 


AW96B543 


H&203270 


ESTs, Weakly similar to ALU1_HUMAN ALU S 


3320 


437052 


AA861697 


Hs.120591 


ESTs 


33.02 


418396 


AI765805 


Hs26691 


ESTs 


32.68 


434036 


AI659131 


Hs.197733 


hypothetical protefn MGC2849 


32.44 


407709 


AA456135 


HS23023 


ESTs 


32.10 


426747 


AA535210 


Hs.171995 


kallikrein 3, (prostate specific antigen 


31 JO 


407168 


R45175 




ESTs 


31.72 


440260 


A1972867 


Hs.7130 


copina W 


3052 


421513 


X00949 


Hs.105314 


relaxin 1 (HI) 


30.10 


416370 


N90470 


Hs203697 


ESTs, Weakly similar to 138022 hypolhetl 


29.68 


407122 


H20276 


Hs31742 


ESTs 


29.24 


400287 


S39329 


Hs-181350 


kallikreln 2, prostatic 


28.90 


432244 


AI669973 


Hs.200574 


ESTs 


2B.74 


451939 


U80456 


Hs.27311 


single-minded prosopMa) homoiog 2 


2B.74 


415989 


AI267700 


Hs.111128 


ESTs 


28.34 


418961 


AW967646 


H&23Q23 


ESTs 


27 .34 


425628 


NM.004476 Hs.1915 


folate hydrolase (prostate-specific memb 


27.32 


456509 


AA654650 


Hs.282906 


ESTs 


2724 


448290 


AK002107 


Hs.20843 


Homo sapiens cDNA RJ1 1245 fis, clone PL 


27.16 


428336 


AA503115 


Hs.183752 


mkxosemlnoprotein, beta- 


26.17 


450096 


AI682088 


H&223368 


hokxarboxylase synthetase (bioSn-fprop 


25X0 


400299 


X07730 


Hs.171995 


kaHikrein 3, (prostate specific antigen 


2451 


437571 


AA760894 


Hs.153023 


ESTs 


24.74 


453160 


AK63307 


Hs.146228 


H2B histone tamily, member L 


24X6 


453096 


AW294631 


Hs.1 1325 


ESTs 


24.46 


425075 


AAS06324 


H&1852 


acid phosphatase, prostate 


2423 


407202 


N58172 


Hs.109370 


ESTs 


24.18 
424846 AU077324 Hs.1832 neuropeptide Y 2357

453370 AI47Q523 Hs.182356 ATP-bintCng cassette, sub-tamily C (CFTR 23.16 

422805 AA438989 Hs.121017 H2AWstana(am3y, member A 2252 

444917 R68851 Hs.144997 ESTs 22.26 

5 408826 AF216077 H&48376 Homo sapiens doaa HB-2 mRNA sequence 2252 

413597 AW3Q2885 Hs.117183 ESTs 21.76 

426429 X73114 Hs.169849 myosin-binding protBin C, slow-type 2152 

435981 H74319 Hs.188620 ESTs 21.12 

432966 AA650114 ESTs 2157 

10 418848 AB20961 Hs.193465 ESTs 2156 

405685 2050 

443271 BE568568 Hs.195704 ESTs 19^8 

418819 AA228776 Ks.191721 ESTs ig.94 

420757 X78592 Hs59915 androgen receptor {dilrydrotestostercrte r 19.72 

15 418994 AA296520 Hs59546 satecfti E (endothelial adhesion molecul 1956 

429918 AW873988 Hs.119383 ESTs 1954 

415539 AI733881 Hs.72472 ESTs 18,43 

450382 AA397658 Hs.60257 Homo sapians eONA HJ13598 fis, done PL 1854 

418829 AA516531 Hs55999 NK homeobox (Drosophila), family 3, A 1858 

20 429984 AL0501Q2 Hs527209 hypofhefical protein FU21617 1752 

443822 AI087412 Hs.143611 ESTs, Weakly similar to 2004399A chromos 1758 

431676 AI685464 Hs592638 gb.H88[04Jf1 NCLCGAP_Pr28 Homo sapiens 1754 

410330 AW023S30 Hs.46786 ESTs 1752 

432441 AW292425 Hs.163484 ESTs 1741 

25 452792 AB037765 Hs50652 K1AA1344 protein 1759 

445472 AB006631 Hs.12784 Homo sapiens mRNA for KIAA0293 gene, par 1750 

414565 AA502972 Hs.183390 hypothetical protein HJ13590 1652 

430487 087742 Hs541552 MAA026B protein 16.72 

431716 D89053 Hs5680l2 fatry-acW-Coenzyrne A lipase, long-chain 1650 

30 4.19536 AA603305 gbiip12d11.s1 NC|_CGAP_Pr3 Homo sapiens 1650 

439677 R82331 Hs.1 64599 ESTs 16.46 

449625 NMJ014253 Hs53786 odz (odd Oz/ten-m, Drosophila) homolog 1 1652 

408430 S79876 Hs.44926 dpepfidy^epSdase IV (C026, adenosine 1658 

447033 AB57412 Hs.157601 ESTs 1652 

35 453006 AI362575 Hs.167133 ESTs 15.74 

431474 AL133990 Hs.190642 ESTs 15.70 

420218 AW95B037 Hs52437 libosornaj protein L4 15.64 

408000 L11690 Hs.620 bullous pemphigoid antigen 1 (230/240kD) 1554 

416208 AW291168 Hs.41295 ESTs, Wealdy similar to MUC2_HUMAN MUCIN 15/48 

40 430226 BE245562 Hs5551 adrenergic, beta-2-, receptor, surface 1540 

415263 AA948033 Hs.130853 ESTs 1558 

432437 W07088 Hs593685 ESTs 1526 

428398 AE49368 Hs58558 ESTs 1551 

429900 AA460421 Hs50875 ESTs 1450 

45 449156 AF103307 Hs.171353 prostate cancer anflgen 3 1459 

411096 U80034 Hs58583 mitochondrial Intermediate peptidase 1451 

435974. U29690 Hs57744 Homo sapiens beia-1 adrenergic receptor 14.76 

444484 AX002126 Hs.1 1260 hypothetical protein FU1 1264 14.76 

422728 AW937826 Hs.103262 ESTs, Weakly simitar to ZN91 HUMAN ZINC 1450 

50 418601 AA279490 Hs56368 caimegln 1456 

448999 AF179274 Hs22791 transmembrane protein with EGRke and 1455 

445885 AI734009 Hs.127699 KIAA1603 protein 1444 

452712 AW838616 gbflC5-LT0054-14020M13JX)1 LT0054 Homo" 1452 

432189 AA527941 gb:nh30c04.s1 NO_CQAP_Pr3 Homo sapians 14.12 

55 424565 AW102723 Hs.75295 guanybte cyclase 1, soluble, alpha 3 13.78 

429290 AF203032 Hs.198760 neurofBamant, heavy polypeptiae (200kD) 1357 

419264 AA877104 Hs293672 ESTs, Weakly similar to ALUBJiUMAN I!!! 13.40 

416445 A1543004 Hs500678 KIAA01 35 protein 13.32 

407275 AI364186 gtxrw34h07jr1 NCI_CGAP Ut4 Homo sapiens 1354 

60 408369 R38438 Hs.182575 sotute carrier family 15 (H4/pepHde tra 1351 

446720 AI439138 Hs.140546 ESTs 1106 

434988 AI418055 Hs.161160 ESTs 13.02 

448172 N75276 Hs.135904 ESTs 1258 

416182 NMJ04354 Hs.79069 cyctnG2 1254 

65 420544 AA677577 Hs58732 Homo sapiens Chromosome 16 BAG done CIT 12.79 

445413 AA151342 Hs.12677 CGM47 protein 1264 

452588 AA889120 Hs.1 10637 homeoboxAlO 1252 

407819 R42185 Hs574803 ESTs 1250 

433444 AW975324 Hs.129816 ESTs 1250 
421059 AI654133 H&30212 thyroid receptor interacting protein 15 1i30

420077 AW512260 Hs.87767 ESTs 1124 

453930 AM19466 H&3S727 hypothetical protein RJ10903 1122 

441610 AW576148 Hs.148376 ESTs 1120 

S 451009 AA013140 Hs.115707 ESTs 1116 

433764 AW753676 H&39982 ESTs 1116 

440286 U29589 Hs.7138 choGnargte recaptor, muscarinic 3 12.04 

443912 R37257 Hs.184780 ESTs 1152 

419526 AI821895 Hs.193481 ESTs 1151 

10 423073 BE252922 Hs.123118 MAD (mothers against decapentaplegic, Dr 1157 

452784 BE463857 Hs.151258 hypolhefical protein FU21062 1156 

414422 AA147224 Hs.71814 ESTs 11.76 

450203 AF097994 H&301528 L-kynurenlra/alpha^lnoaclipatB amhotra 1158 

436679 A1127483 Hs.120451 ESTs, Weakly similar to unnamed protein 1150 

IS 440901 AA909358 Hs.128612 ESTs 1150 

448045 AJ297436 H&20168 prostate stamceD antigen 1151 

433887 AW204232 Hsl79522 ESTs 1150 

434980 AW770553 H&293640 sterol 0-acyHransferase (acyt-Coenzyma 1158 

425905 AB032959 Hs.161700 novel C3HC4 type Zinc finger (rfngSnge 1153 

20 434680 T11738 Hs.127574 ESTs 1152 

449650 AF055575 Hsl97647 cald'um channel, vortage-dnpendent, L ty 11.18 

431173 AW971198 Hs594068 ESTs 11.16 

434539 AW748078 H&214410 ESTs, Weakly similar to MUC2_HUMAN MUCIN 11.16 

410037 AB020725 HS58009 KIAA0918 protein 11.14 

25 417708 M74392 Hs50495 ESTs 11.14 

458332 AKJ00341 Ks220491 ESTs 11.12 

420381 D50840 Hs501782 phosphodiesterase 3B, cGMP-tnhMed 11.10 

425665 AK001050 Hs.159066 hypothetical protein FU10188 1158 

425710 AF030880 Hs.159275 solute carrier family, member 4 1158 

30 428728 NM-016625 Hs.191381 hypothetical protein 1154 

407021 U52077 gb:Human marinerl transposase gene, comp 1152 

410733 D84284 Hs56052 CD38 antigen (p45) 1152 

401714 1050 

434485 AI623511 Hs.118567 ESTs 1059 

35 415786 AW419196 Hs557924 hypothetical protein HJ13782 1057 

452340 NM_002202 Hs505 ISL1 trarBatption factor, UM/horneodorna 1055 

453628 AW243307 Hs.170187 hypothetical protein 10.72 

408063 BE086548 Hs.42346 caJdneuriivbrnding prote'm calsarcin-1 10.67 

417687 AI828596 Hs550691 ESTs 1054 

40 434666 AF151103 Hs.112259 ToeB receptor gamma locus 1053 

432374 W68815 Hs501885 Homo sapiens cDNA FU11346 (is, clone PL 1050 

428819 AL135623 Hs.193914 WAA0575 gene product 10.48 

413409 AI638418 H&21745 DEAD/H (Asp-Glu-Ate-Asp/His) box polypep 10.44 

428775 AA434579 Hs.143691 ESTs 1021 

45 436556 AI364997 Hs.7572 ESTs 1010 

441690 R81733 Hs53106 ESTs 10.14 

419852 AW503756 H&286184 bypotheiJcal protein <1J551 D2^ 10.10 

421991 NMJM4918 Hs.110488 WAAO990 protein 1054 

423698 AA329788 Hs.1098 DKFZp434J1813 protein 1052 

50 452039 AI922988 Hs.172510 ESTs 1050 

433043 W57554 Hs.125019 ESTs 958 

433927 A1557019 Hs.116467 small nuclear protein PRAC 9.97 

445424 AB028945 Hs.12696 corlactteSH3dornaW)Inding protein ' 9.96 

432240 AI694767 Hs.129179 Homo sapiens cDNA FU13581 Us, done PL 958 

55 433104 AL043002 Hs.128246 ESTs, Moderately similar to unnamed pro! 954 

452744 AI267652 Hs50504 Homo sapiens mRNA; cONA DKFZp434E082 (fr 9.82 

431217 NM-013427 Hsl50830 Rho GTPase activating protein 6 9.75 

427398 AW390020 H&20415 chromosome 21 open reading frame 11 9.70 

446896 T157S7 H&22452 Homo sapiens mRNA for KIAA1737 protein, 9.70 

60 421470 R27496 Hs.1378 annexinA3 9.64 
He 1KQ910


CO 1 o 


3.65 


400 \£l 


/UD900/ 1 


Hs9Q4110 


ESTs 


165 




423396 AJ382555 Hs.127950 bromodomain-containing 1 3.65

419346 AI830417 polybromo 1 3.64

441540 C01367 Hs.127128 ESTs 3.64

446501 AI302616 Hs.150819 ESTs 3.64

459527 AW97755S Hs.291735 ESTs, Weakly similar to I78885 serine/th 3.63

446320 AF126245 Hs.14791 acyl-Coenzyme A dehydrogenase family, me 3.63

435706 W31254 Hs.7045 GLOW protein 3.63

400110 3.62

410313 R10305 Hs.185683 ESTs 3.62

414713 BE465243 Hs.12664 ESTs 3.62

436279 AW900372 Hs.180793 ESTs, Weakly similar to S65657 alpha-1C- 3.62

439818 AL360137 Hs.19934 Homo sapiens mRNA full length insert cDN 3.62

451797 AW663858 Hs56120 small inducible cytokine subfamily E, me 3.62

451294 AJ457338 Hs.29894 ESTs 3.62
404194


AC1 IQQAJ 


Un OOOQ/fA 


nomo sapiens rnu iosu rnnJv\, paiuai cos 


3.62 




404&W 












4UH101 


AWsCOOW 


n$.1Z39/o 


^ft^4 mlnlfwf m«m(aTa LliuifM 7 

uLX^-fBiaisa pnHBui wnaso 7 


O.QC 




435846 


A ATArtOTA 

AA7UGB7U 


HS, 14304 


tolS 


O.DI 


c 
J 


432833 


N01U75 


(J* itT-tn-i 


CO IS 


OJO I 




427278 


A AjfAAOCO 

AA4W£09 


Un ytQCQQ 


cols 


Q fH 

OJO 1 




433495 


AlAW7TTO.il 

AW3/o/o4 


nS.71 


alpha*2*fllyooprotein 1, zinc 


0 cn 

OnfU 




403137 








OnJW 


in 


404165 








OJ3M 


409571 


A AC AAA At\ 

AA504249 


U. 4 Q7CQC 
HS. 1O/OO0 


tolS 


0 en 




410561 


Bc54Q255 


Hs.6994 


nomo saptens cuna. rijzzw* ns, aons n 


Q RA 
O.DU 




412924 


Db01o4Z2 


Un TCOCQ 


H2A hlstono family, ntsmbsr Y 


q ra 




434228 


740Aj(7 


Uo A0007Q 


nomo sapiens rnUii/o 1 ninnA, cornpraiQ cos 




1 c 
LJ 


436797 


AA731491 


Un 170S40 

nS.i7o5lo 


hypothGD*caI protsbi MQC14879 


0 en 

O.DU 


437162 


AlA/nnf pap 

, AWO055O5 


Hs.5464 


thyroid hormone receptor coactrvafing pr 


0 en 
0.0U 




437444 


H450Q8 




COT- 


«CA 

&DU 




404210 












446157 


Bc27ua2a 


nS.131740 


nomo sapiens cuna, rLJZZboz ns, ciona n 




ZU 


437587 


AI591222 


HS. 122421 


Hinnan DNA sequence from clone RP1-187J1 1 


O CO 


423147 


AA9B7927 


nS. 131740 


nomo sapiens cuna: rijzzxz us, ctona n 






452226 


AA02489B 


AACAA4 

nS29o0QZ 


coTS 


OaTO 




443775 


AFZ91664 


H&20473Z 


matrix metaSopioteinase 26 


inOD 




452501 


AB037791 


■ f _ AA*MA 

H&29716 


nypotneucal protetn HJiwoO 


OJDO 


*>< 


428647 


AAS30050 


MS. 124344 


to IS 


OaDO 


422443 


NM_0147O7 


KS.1 16753 


histone daac8tytasa7B 


4 KR 

0JO0 




447966 


A A04AOAC 

AA340oUo 


if— 4ACOQ7 

nS.10300/ 


to i s, w eawy strruiar to nomoiog or rai ^ 






420892 


AW975075 


U» <t74£QO 

HS.172509 


nuctear phosphoprotetn similar to S. cer 


Oaw 




420230 


AL034344 




lorKneaa box 01 


QRC 


M) 


418428 


Y12490 


H&85092 


II,.,, .f,| 1 1 , i. .J,. ■ 44 . 

tnyroki noirnone receptor tnteracior 1 1 


3^>4 


428949 


A A J iA4 PA 

AA442153 


HS.104744 


nypotneucal protein ui\rZp434J0oi7 


O CA 




444929 


AIbooo41 


nS.161034 


CCTn 

COlS 


4 RA 




433339 


AF019226 


H&8036 


glioblastonia ovsrexpressed 


9 RA 




424369 


R87622 


m nert4 A 

H&26714 


1/IAA40Q-4 «m4nln 

K1AAI001 protein 


O.04 




433002 


AF048730 


n&27990o 


MantmT4 

cyc&nTi 




435425 




Un OliMC 

n&d14l0 


CCTc 
COIS 






415621 


AI6486Q2 


HS. 131 189 


CCTn 

to I S 


■9 CO 
OA) 




416974 


AHJ10Z33 


nSiJUOO/ 


RALBP1 assoctated Eps domain containing 


QR5 
«1a?0 




405793 








O CA 


/(A 


409770 


AW499536 




n*»«l lf_LlC-DDAn_tILA.1Q-AJ II rl MIU Ufl/^ C 


O CO 
Oa>C 


425305 


AA363025 


HS.155572 


Human clone 23601 mRNA sequence 


O CO 




428939 


AW23655Q 


HS. 131914 


ESTs 


qen 




438388 


AA805349 


HS.44698 


CCTn 
CO IS 


O CO 
OnMl 




443703 


. AVo4ol7/ 


nsunouzi 


to IS 


O CO 




457940 


ai ocmcn 


|_|_ OAJ4C 

HS.3U445 


Homo sapiens TRIparfite motif protein ps 


O CA 
Oa0& 


402444 












409643 


AW45UOOO 




CCTn 

to IS 


^ R1 




418250 


U29926 


HS.B3918 


adenosine monophosphate deaminase (isofb 


OR1 
Oa)1 




432745 


Aiaoioofi 
AIo21;£b 




qOju/oIUDjo wi^|jiA*wr_j'io notnu sapiens 






414222 


AU351/3 


nS.o/o 


sorb%}t dehydrogenase 


QR1 


430061 


AOA0701 7 


Llo 94A4flQ 


fxlAAtoaO pnj It! In 






421491 




11— AOTOC 


CCTc 

tois 


q ka 




422384 


A A0O>IAT7 


ns.4&4oo 


om protein r 


3\50 




434565 


TCOIT) 

T5Z171 




COIS 


q cn 
o^u 




438379 


N23018 


11. 4714A1 

ns.1 71391 


C-tenninaJ binding protein 2 


0 cn 


439741 


DCJ/9040 


Me CiQtiA 


nuinu sapiens niruvt hm cungui uuwii 


3a50 






Q07Mn 


Hs.33417 


Hnmn caniflni rfiMA' FLl22flOS fis. done K 


350 




447805 AW627932 Hs.19614 gemin4 3.50

454265 H03556 Hs.300949 ESTs, Weakly similar to thyroid hormone 3.50

418838 AW385224 H^35198 ectonucleotide pyrophosphatase/phosphodi 3.50

448804 AW512213 Hs.42500 ADP-ribosylation factor-like 5 3.50

409617 BE003760 Hs£5209 Homo sapiens mRNA; cDNA DKFZp434K0514 (f 3.49

434075 AW003416 Hs.160604 ESTs 3.49

444190 AI878918 Hs.10526 cysteine and glycine-rich protein 2 3.49

435017 AA336522 Hs.12354 angiotensin II, type 1 receptor-associat 3.48

423445 NM_014324 Hs.128749 alpha-methylacyl-CoA racemase 3.48

420271 AI954365 Hs.42892 ESTs 3.48

443664 AI681307 Hs.166674 ESTs 3.48

444168 AW379879 gb^C1-HT025^0ai19W)11-fOt HT0256Homo 3.48

446074 AA079799 Hs.29263 hypothetical protein FLJ11895 3.48
452582 AL137407 Hs.25911 Homo sapiens mRNA; cDNA DKFZp434M232 (fr 3.48

431542 K83010 Hs.5740 ESTs 3.48

432697 AW975050 Hs.293852 ESTs, Weakly similar to ALU4_HUMAN ALU S 3.48

435572 AW975339 Hs.239828 ESTs, Weakly similar to GAG2_HUMAN RETRO 3.47

407192 AA609200 gba«2e02j1 ScaresJestlsJJHT Homo sap 3.47

413435 X51405 Hs.75360 carboxypeptidase E 3.46

447210 AF035269 Hs.17752 phosphatidylserine-specific phospholipas 3.46

447958 AW796524 Hs.68644 Homo sapiens microsomal signal peptidase 3.46

425312 AA354940 Hs.145958 ESTs 3.46

442007 AA301116 Hs.142838 nucleolar phosphoprotein Nopp34 3.46

417455 AW007066 Hs.18949 ESTs, Weakly similar to CA2B_HUMAN COLLA 3.45

426931 NM_003416 Hs.2076 zinc finger protein 7 (KOX 4, clone HF.1 3.45

408739 W01558 Hs238797 ESTs, Moderately similar to I38022 hypot 3.45

436024 AI800041 Hs.190555 ESTs 3.45

408418 AW983897 Hs.44743 KIAA1435 protein 3.45

409151 AA306105 Hs.50785 SEC22, vesicle trafficking protein (S. c 3.44

418626 AW299508 Hs.135230 ESTs 3.44

420560 AW207748 Hs.59115 ESTs 3.44

420686 AB50339 Hs.40782 ESTs 3.44

428870 AA436831 Hs.36049 ESTs 3.44

436754 AI061288 Hs.133437 ESTs 3.44

437960 AI669586 Hs.222194 ESTs 3.44

452300 AW628045 HSJ28896 Homo sapiens mRNA full length insert cDN 3.44

421887 AW161450 Hs.109201 CGW8 protein 3.44
TABLE 5A shows the accession numbers for those primekeys lacking a unigeneID in Tables
5, 6, and 7. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey CAT number Accession

407596 1003489_1 R86913 R86901 H25352 R01370 H43764 AW044451 W21298
408432 1058667_1 AW195262 R27868 AW811262
409752 115301_1 AW963990 AA078196 AW749482 AA077468 BE151571 AA376917
409770 1154048_1 AW499536 AW499553 AW502138 AW499537 AW502136 AW501743
411440 124577_1 AW749402 AW749403 Z45743 R80376 AA093358
411479 1247077_1 AW848047 AW848202 AW848631 AW848142 AW848702 AW848121 AW848632 AW848140 AW848571
AW848009 AW848067 AW848069 AW848905 AW848214
411624 1252166_1 BE145964 BE146286 AW854564
412991 134248_1 AW949013 AA126111
414269 143133_1 AA298489 AA137165
415123 1523390_1 D60925 D60828 D80787
415715 1548818_1 F30364 F36559 T15435
416288 1585983_1 H51299 H44619 H46391 R86024 H51892 T72744
416289 1586037_1 W26333 R05358 H44682
417730 1695795_1 Z44761 R25801 R11926 R35604
418636 177402_1 AW749855 AA225995 AW750208 AW750206
419346 184129_1 AI830417 AA236612
419536 185688_1 AA603305 AA244095 AA244183
420111 190755_1 AA255652 AA280911 AW967920 AA262684
422219 213547_1 AW978073 AW978072 AA807550 AA306567
424179 236389_1 F30712 F35665 AW263888 AI904014 AI904018 AA336927 AA336502
424242 237181_1 AA337476 AW966227 AA450376 AW960222 AA381051
428002 285602_1 AA418703 AA418711 BE071915 BE071920 BE071912
429163 300543_1 AA884768 AW974271 AA592975 AA447312
432189 342819_1 AA527941 AI810608 AI620190 AA635266
432340 345248_1 AA534222 AA632632 T81234
432363 345469_1 AA534489 AW970240 AW970323
432956 356839_1 AA650114 AW974148 AA572946
433586 370470_1 T85301 AW517087 AA601054 BE073959
433641 37186_1 AF080229 AF080231 AF080230 AF080232 AF080233 AF080234 BE550633 AI636743 AW614951 BE467547
AI680833 AI633818 N29986 U87592 U87593 U87590 U87591 S46404 U87587 AA463992 AW206802 AI970376
AI583718 AI672574 N25695 AW665466 AI818326 AA126128 AI480345 AW013827 AA248638 AE14968
AA204735 AA207155 AA206262 AA204833 AW003247 AW495808 AI080480 AI631703 AI651023 AI867418
AW818140 AA502500 AI206199 AI671282 AB52545 BE501030 AI652535 BE465762 AA206331 AW451866
AA471088 AA206342 AA204834 AA206100 AW021661 AA332922 N66048 AA703396 H92278 AW139734
H92683 U87589 U87595 H69001 U87594 BE466420 AI624817 BE466611 AI206344 AA574397 AA348354
AI493192
433687 373061_1 AA743991 AA604852 AW272737
433891 376239_1 AA613792 AW182329 T05304 AW858385
434415 385931_1 BE177494 AW276909 AA632849
434565 38898_1 T52172 AF147324 T52248
434804 393481_1 AA649530 AA659316 H64973
437113 433234_1 AA744693 AW750059
444168 593829_1 AW379879 AI126285 H12014
448212 755099_1 AI475858 AW969013
448310 757918_1 AI480316 AW847535
451746 883303_1 M56178 AI813822 D56993
452560 922216_1 BE077084 AW139963 AW863127 AW806209 AW806204 AW806205 AW806206 AW806211 AW806212
AW806207 AW806208 AW806210 AI907497
452712 928309_1 AW838616 AW838660 BE144343 AI914520 AW888910 BE184854 BE184784
453773 980699_1 AL133761 AL133767
455276 1272541_1 BE176479 BE176678 BE176357 BE176550 AW886079 BE176676 BE176615 BE176555 BE176489 BE178610
BE176382
455309 1278153_1 AW894017 AW893956 AW894032
TABLE 5B shows the genomic positioning for those primekeys lacking unigene IDs and
accession numbers in Tables 5, 6, and 7. For each predicted exon, we have listed the
genomic sequence source used for prediction. Nucleotide locations of each predicted exon
are also listed.



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the
publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.
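
As an illustration only, the Nt_position field described above can be read programmatically. The following Python sketch is not part of the original disclosure. The function names parse_nt_position and total_exon_length are hypothetical, and the sketch assumes that each exon is written as start-end, that exons are separated by commas, and that both coordinates are inclusive.

```python
from typing import List, Tuple


def parse_nt_position(field: str) -> List[Tuple[int, int]]:
    """Split an Nt_position string into (start, end) pairs, one per exon.

    Assumes the 'start-end,start-end,...' layout used in Table 5B.
    """
    exons = []
    for chunk in field.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate a trailing comma
        start_text, end_text = chunk.split("-")
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"exon start exceeds end: {chunk!r}")
        exons.append((start, end))
    return exons


def total_exon_length(exons: List[Tuple[int, int]]) -> int:
    """Sum exon lengths, treating both coordinates as inclusive (an assumption)."""
    return sum(end - start + 1 for start, end in exons)


# Entry for Pkey 405403 (Minus strand) as listed in Table 5B.
exons_405403 = parse_nt_position("37491-37670,40951-41031")
```

Entries whose coordinates are not legible in this copy would raise an error in such a parser, which is one simple way to flag rows that need to be checked against the original table.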



Pkey Ref Strand Nt_position

401045 8117619 Plus 80044-90184,91111-91345
401424 8176894 Plus 24223-24428
401451 6S34068 Minus 119926-121272
401714 6715702 Plus 9648446681 m
401747 9789672 Minus 11859S-118816,119119-119244,119609-119781,120422-120990,130161-130381,130468-13CS93,131097-131258,131866-131932,132451-132575,133580-134011
401785 7249190 Minus 165776-165996,166189-166314,166408-166569,167112-167268,167387-167469,168634-168942
401619 7467933 Minus 28217-28486
402408 9796239 Minus 110326-110491
402444 9796614 Plus 28391-28517
402791 6137008 Minus 51036-51207
403047 3540153 Minus 59793-59968
403137 9211494 Minus 92349-92572,92958-93084,93579-93712,93949-94072,94591-94748,95214-95337
403721 7528046 Minus 156647-157366
403764 7717105 Minus 118692-118853
403797 8099896 Minus 123065-125008
404165 9926489 Minus 69025-69128
404210 5006246 Plus 169926-170121
404253 9367202 Minus 55675-56055
404561 9795980 Minus 69039-70100
404571 7249169 Minus 112450-112648
404721 9856648 Minus 173763-174294
404915 7341766 Minus 100915-101087
404939 6862697 Plus 175318-175476
405403 6850244 Minus 37491-37670,40951-41031
405685 4508129 Minus 37956-38097
405718 9795467 Plus 113080-113266
405793 1405887 Minus 89197-89453
405876 6758747 Plus 39694-40031
405917 7712162 Minus 106829-107213
405414 9256407 Plus 49593-49850
406554 7711566 Plus 106956-107J21
TABLE 6: 286 GENES ENCODING EXTRACELLULAR OR CELL SURFACE
PROTEINS UP-REGULATED IN PROSTATE CANCER COMPARED TO
NORMAL ADULT TISSUES

Table 6 shows 286 genes up-regulated in prostate cancer compared to normal adult tissues
that are likely to be extracellular or cell-surface proteins. These were selected as for Table 5
and the predicted protein contained a structural domain that is indicative of extracellular
localization (e.g. egf, 7tm domains).



Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: Ratio of tumor to normal tissue




Pkey ExAccn UnigeneID Unigene Title R1

409361 NM_005982 Hs.54416 sine oculis homeobox (Drosophila) homolo 4828
409731 AA125985 Hs.56145 thymosin, beta, identified in neuroblast 4524
400298 AA032279 Hs.61635 six transmembrane epithelial antigen of 43/18
420154 AI093155 HS55420 JM27 protein 41.12
426747 AA535210 Hs.171995 kallikrein 3, (prostate specific antigen 3130
400299 X07730 Hs.171995 kallikrein 3, (prostate specific antigen 2451
425075 AA506324 Hs.1852 acid phosphatase, prostate 2423
424846 AU077324 Hs.1832 neuropeptide Y 2357
405685 2050
420757 X78592 Hs.99915 androgen receptor (dihydrotestosterone r 19.72
418594 AA296520 HS39546 selectin E (endothelial adhesion molecul 1956
452792 AB037765 Hs30652 KIAA1344 protein 17.39
445472 AB006631 Hs.12784 Homo sapiens mRNA for KIAA0293 gene, par 17.00
414565 AA502972 Hs.183390 hypothetical protein FLJ13590 16.82
431716 D89053 Hs.268012 fatty-acid-Coenzyme A ligase, long-chain 16.60
408430 S79876 Hs.44926 dipeptidylpeptidase IV (CD26, adenosine 1628
408000 L11690 Hs£20 bullous pemphigoid antigen 1 (230/240kD) 1554
430226 BE245562 Hs.2551 adrenergic, beta-2-, receptor, surface 15.40
444484 AK002126 Hs.11260 hypothetical protein FLJ11264 14.76
418601 AA2794S0 HSJ6368 calmegin 1456
448999 AF179274 Hs22791 transmembrane protein with EGF-like and 1455
416182 NM_004354 Hs.79069 cyclin G2 1234
420544 AA677577 Hs.98732 Homo sapiens Chromosome 16 BAC clone CIT 12.79
445413 AA151342 Hs.12677 CGI-147 protein 12.64
453930 AA419466 HS36727 hypothetical protein FLJ10903 1222
440286 U29589 Hs.7138 cholinergic receptor, muscarinic 3 12.04
452784 BE463857 Hs.151258 hypothetical protein FLJ21062 11.86
450203 AF097994 Hs.301528 L-kynurenine/alpha-aminoadipate aminotra 11.68
448045 AJ297436 Hs.20166 prostate stem cell antigen 1151
449650 AF055575 HS23838 calcium channel, voltage-dependent, L ty 11.18
420381 D50640 Hs.337616 phosphodiesterase 3B, cGMP-inhibited 11.10
425665 AK001050 Hs.159066 hypothetical protein FLJ10188 11.08
425710 AF030880 Hs.159275 solute carrier family, member 4 11J33
428728 NM_016625 Hs.191381 hypothetical protein 11.04
407021 U52077 gb:Human mariner1 transposase gene, comp 11.02
410733 D84284 Hs£6052 CD38 antigen (p45) 11.02
452340 NM_002202 Hs.505 ISL1 transcription factor, LIM/homeodoma 10.85
428819 AL135623 Hs.193914 KIAA0575 gene product 10.48
421991 NM_014918 Hs.110488 KIAA0990 protein 1054
431217 NM_013427 Hs250830 Rho GTPase activating protein 6 9.75
421470 R27496 Hs.1378 annexin A3 9.64
409262 AK000631 Hs52256 hypothetical protein FLJ20624 9.45
435980 AF274571 Hs.129142 deoxyribonuclease II beta 924
421246 AW582962
TABLE 7: 42 GENES ENCODING SMALL MOLECULE TARGETS UP-REGULATED IN
PROSTATE CANCER COMPARED TO NORMAL ADULT TISSUES

Table 7 shows 42 genes up-regulated in prostate cancer compared to normal adult tissues that
are likely to be small molecule targets. These were selected as for Table 5 and the predicted
protein contained a structural domain that is indicative of a druggable structure (e.g. protease,



ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

PSDomain: Protein Structural Domain

R1: Ratio of tumor vs. normal tissue



Pkey ExAccn UnigeneID Unigene Title PSDomain R1

426747 AA535210 Hs.171995 kallikrein 3, (prostate specific antigen trypsin 3130
400299 X07730 Hs.171995 kallikrein 3, (prostate specific antigen trypsin 2451
420757 X78592 Hs.99915 androgen receptor (dihydrotestosterone r Androgen_recep,hormone_rec,zf-C4 19.72
408430 S79876 Hs.44926 dipeptidylpeptidase IV (CD26, adenosine DPPIV_N_term,Peptidase_S9 1628
430226 BE245562 Hs.2551 adrenergic, beta-2-, receptor, surface 7tm_1 15.40
411096 U80034 Hs.68583 mitochondrial intermediate peptidase Peptidase_M3 1431
440286 U29589 Hs.7138 cholinergic receptor, muscarinic 3 7tm_1 12.04
420381 D50640 Hs.337616 phosphodiesterase 3B, cGMP-inhibited PDEase 11.10
407021 U52077 gb:Human mariner1 transposase gene, comp SET,Transposase_1 11.02
401424 arginase 958
410001 AB041036 Hs57771 kallikrein 11 trypsin 953
428330 L22524 Hs.2256 matrix metalloproteinase 7 (matrilysin, Peptidase_M10 8.76
424099 AF071202 Hs.139336 ATP-binding cassette, sub-family C (CFTR ABC_tran,ABC_membrane 754
419991 AJ000098 Hs.94210 eyes absent (Drosophila) homolog 1 Hydrolase 720
431992 NM_002742 Hs5891 protein kinase C, mu pkinase,DAG_PE-bind,PH 6.49
447359 NM_012093 Hs.18268 adenylate kinase 5 adenylatekinase 650
400301 X03635 Hs.1657 estrogen receptor 1 OesLrecep^MM.hoimonejec 5.78
421685 AF189723 Hs.106778 ATPase, Ca++ transporting, type 2C, memb E1-E2_ATPase,Hydrolase 557
444042 NM_004915 Hs.10237 ATP-binding cassette, sub-family G (WHIT ABC_tran 531
447752 M73700 Hs.105938 lactotransferrin transferrin,7tm_1 529
407945 X69208 Hs.606 ATPase, Cu++ transporting, alpha polypep E1-E2_ATPase,Hydrolase,HMA 558
403047 trypsin 451
427617 D42033 Hs.199179 RAN binding protein 2 Ran_BP1,zf-RanBP,TPR,pro_isomerase 4.88
422083 NM_001141 Hs.111256 arachidonate 15-lipoxygenase, second typ lipoxygenase,PLAT 452
449535 W15267 Hs53672 low density lipoprotein receptor-related ldl_recept_b,ldl_recept_a,EGF 452
425071 NM_013989 Hs.154424 deiodinase, iodothyronine, type II T4_deiodinase 452
423740 Y07701 Hs593007 aminopeptidase puromycin sensitive Peptidase_M1 454
424701 NM_005923 Hs.151988 mitogen-activated protein kinase kinase pkinase 451
424085 NM_002914 Hs.139226 replication factor C (activator 1) 2 (40 AAA,Viral_helicase1 450
417531 NM_003157 Hs.1087 serine/threonine kinase 2 pkinase 4.12
428695 AI355647 Hs.189999 purinergic receptor (family A group 5) 7tm_1 351
410011 AB020641 Hs57856 PFTAIRE protein kinase 1 pkinase 3.91
424850 AA151057 Hs.153498 chromosome 18 open reading frame 1 ldl_recept_a 352
412350 AI659306 Hs.73826 protein tyrosine phosphatase, non-recept Y_phosphatase,Band_41,PDZ 3.70
447397 BE247676 Hs.18442 E-1 enzyme Hydrolase 3.68
452946 X95425 Hs.31092 EphA5 EPH_lbd,fn3,pkinase,SAM 3.66
427144 X95097 Hs5126 vasoactive intestinal peptide receptor 2 7tm_2 3.65
443775 AF291664 Hs504732 matrix metalloproteinase 26 Peptidase_M10 356
457940 AL360159 Hs.306517 Homo sapiens TRIpartite motif protein ps SPRY,7tm_1 352
418250 U29926 Hs.83918 adenosine monophosphate deaminase (isofo A_deaminase 351
413435 X51405 Hs.75360 carboxypeptidase E Zn_carbOpept 3.46
447210 AF035269 Hs.17752 phosphatidylserine-specific phospholipas lipase 3.46
TABLE 8: 136 GENES SIGNIFICANTLY DOWN-REGULATED IN PROSTATE
CANCER COMPARED TO NORMAL PROSTATE

Table 8 shows 136 genes significantly down-regulated in prostate cancer compared to normal
prostate. These were selected from 59680 probesets on the Affymetrix/Eos Hu03 GeneChip
array such that the ratio of "average" normal prostate to "average" prostate cancer tissues was
greater than or equal to 2. The "average" normal prostate level was set to the mean amongst 4
normal prostate tissues. The "average" prostate cancer level was set to the 85th percentile
amongst 73 tumor samples. In order to remove gene-specific background levels of
non-specific hybridization, the 10th percentile value amongst all the tissues was subtracted from
both the numerator and the denominator before the ratio was evaluated.
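
By way of illustration only, the selection ratio described above may be sketched in Python as follows. This sketch is not part of the original disclosure. The function names percentile and normal_to_tumor_ratio are hypothetical, linear interpolation between ranked values is an assumed percentile rule (the text does not specify one), and returning None when the background-corrected denominator is not positive is likewise an assumption.

```python
from statistics import mean


def percentile(values, q):
    """Return the q-th percentile (0-100) of values.

    Uses linear interpolation between ranked values; this rule is an
    assumption, since the text does not state how percentiles were taken.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile() needs at least one value")
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def normal_to_tumor_ratio(normal_levels, tumor_levels, all_levels):
    """Ratio of "average" normal prostate to "average" prostate cancer.

    normal_levels: expression in the normal prostate samples (mean is used)
    tumor_levels:  expression in the tumor samples (85th percentile is used)
    all_levels:    expression across all tissues (10th percentile = background)
    """
    background = percentile(all_levels, 10)
    numerator = mean(normal_levels) - background
    denominator = percentile(tumor_levels, 85) - background
    if denominator <= 0:
        return None  # assumed handling; the text is silent on this case
    return numerator / denominator


# Made-up expression values for a single probeset, for demonstration only.
normal = [100, 120, 110, 130]
tumor = [20, 30, 40, 50, 60]
ratio = normal_to_tumor_ratio(normal, tumor, normal + tumor)
selected = ratio is not None and ratio >= 2  # threshold stated for Table 8
```

In this example the pool of "all tissues" is simply the normal and tumor samples combined, which is a simplification made for the demonstration.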

Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Ratio of normal prostate to prostate cancer



Pkey ExAccn UnigeneID Unigene Title R1



425932 M81650 Hs.1968 semenogelin I 57.69
425545 N98529 Hs.158295 Human mRNA for myosin light chain 3 (MLC 19.70
426752 X69490 Hs.172004 titin '52>
442082 R41823 Hs.7413 ESTs; calsyntenin-2
407245 X90568 Hs.172004 titin £8
422711 D60641 Hs21739 Homo sapiens mRNA; cDNA DKFZp586l1518 (f 9.05
420813 X51501 Hs59949 prolactin-induced protein 8.18
411987 AA375975 Hs.183380 "ESTs, Moderately similar to ALU7_HUMAN 7.45
404567 5.62
416030 H15261 Hs21948 ESTs 551
444892 AK520617 Hs.148565 ESTs &Z7
444573 AW043590 Hs.225023 ESTs 5.20
428068 AW016437 Hs.233462 ESTs 5.08
437440 AA846804 Hs.123694 ESTs 455
404113 f/Z?
452279 AA236844 Hs.61260 hypothetical protein FLJ13164 4.75
421058 AW297967 Hs.188181 ESTs j-®
445592 AV654382 Hs.17947 "ESTs, Weakly similar to K02F3.10 [C.ele 453
405183 ^5?
405227 4.45
454059 NM_003154 Hs.37048 statherin
450152 AI138635 Hs22968 ESTs 4/0
407013 U35637 "gb:Human nebulin mRNA, partial cds" 4.03
403612 402
440089 AA864468 Hs.135646 ESTs 4.00
408988 AL119844 Hs.49476 Homo sapiens clone TUA8 Cri-du-chat regi 3.88
436726 AA324975 Hs.128993 "ESTs, Weakly similar to KIAA04S5 protei 355
459367 BE148877 , o^:C^HT0244-111199^)404i12HT0244Hom 195
427318 AF186081 Hs.175783 zinc transporter 3.82
411762 AW860972 "obK)VO<;T0387-180300-167-h07CT0387Hom 355
418668 AW407987 Hs.87150 Human clone A9A2BR11 (CAC)n/(GTG)n repea 3.75
458311 AF069478 "gb:AF069478 Homo sapiens astrocytoma n 351
403649 3^0
419682 H13139 Hs.92282 paired-like homeodomain transcription fa 358
412519 AA196241 Hs.73980 "troponin T1, skeletal, slow" 351
414206 AW276887 Hs.46609 ESTs 3.45
427419 NM_000200 Hs.177888 histatin 3 357
420777 AA280223 Hs.130865 ESTs 355
428134 AA421773 Hs.161008 ESTs 351
450218 R02018 Hs.168640 "Ank, mouse, homolog of" 350
433474 AI192195 Hs.147174 "EST, Highly similar to ubiquitin-protei 350
418833 AW974899 Hs.292776 ESTs 326
400440 X83957 Hs53870 nebulin 3.16
413778 AA090235 Hs.75535 "myosin, light polypeptide 2, regulatory 3.06
423151 AW838068 "0bKaVHTOO4W)1OM)O-1(B-fO2LTOO48 Horn &05
445060 AA830811 Hs58808 ESTs 2.98
457065 AI476318 Hs.192480 ESTs 255
432456 H00093 "gb:ph8f 12u_19/1TV Outward AliHirfmed tin 252
405678 ^
406707 S73840 Hs.931 "myosin, heavy polypeptide 2, skeletal m 2.81
444105 AW189097 Hs.166597 ESTs 2.™
433968 AL157518 HsS0421 PRO2463 protein 2.73
438522 AA809431 HS558886 ESTs 273
436562 H71937 Hs.169756 "complement component 1, s subcomponent" 2.68
412417 AA102268 Hs42175 ESTs ?£7
455590 BE072259 "gb:QV4*TO538-271299-059^04BT0536Horn 255
415380 F07853 Hs.16085 putative G-protein coupled receptor 2.65
428729 AL162331 Hs.191436 hypothetical protein FLJ10619 2.64
408537 AW207734 "gb:UI-H-BI2-ag9-b-01-r>Ul.s1 NCI_CGAP_S 2.63
424706 AA741336 Hs.152108 transcriptional unit N143 2.63
413212 BE072092 *Qb:PM4-BT0532-16Q20*003-b11 BT0532 Ham 2.63
406704 M21665 Hs529 "myosin, heavy polypeptide 7, cardiac mu 2.62
437507 AA758538 Hs546882 ESTs 2.60
410384 AB33794 Hs.42745 ESTs 258
408074 R20723 Hs.124764 ESTs 258
436653 AA829828 Hs.292402 ESTs 252
458090 AI282149 HS56213 "ESTs, Highly similar to FX03_HUMAN FORK 2.51
432003 AI689154 Hs.122972 ESTs 2.50
436915 AA737400 Hs.142230 ESTs 250
410028 AW576454 Hs.258553 ESTs 2.46
448920 AW408009 Hs52580 alkylglycerone phosphate synthase 2.45
422046 AI638562 "gb:ts50a10.x1 NCI_CGAP_Ut1 Homo sapiens 2.44
451122 AA015767 Hs.193587 ESTs 2-*j
422646 H87863 Hs.151380 ESTs f-»
451237 AW600293 "gb:EST00049 pGEM-T library Homo sapiens 256
400001 AFFX control: BioB-3 256
415835 Z45385 "gb:HSC2NF0St normalized infant brain cD 256
439706 AW872527 Hs59761 ESTs 258
423341 AW242394 HS552495 ESTs 258
436486 AA742221 Hs.120633 ESTs 25*
407449 AJ002784 gb:Homo sapiens mRNA; fetal brain cDNA 5 253
430573 AA744550 Hs.136345 ESTs 252
401974 251
443356 AUM4498 Hs.133262 "ESTs, Weakly similar to PHQ217 reverse 251
430751 NM_012471 Hs547868 transient receptor potential channel 5 255
439128 AI949371 Hs.153089 ESTs 255
448765 R15337 Hs51958 "Homo sapiens cDNA FLJ10532 fis, clone N 255
451130 AI762250 HS511347 ESTs 254
405420
455029 AW851258 ■o>:ll3^0220-1602(XW66-H06CT0220Hom 253
438224 AA933999 "gb:on91f04.s1 Soares_NFL_T_GBC_S1 Homo 253
407764 BE008347 "gb:(mBW154-O80400-3254K>4BNO154Hom 253
413549 BE252470 "gb«I1108292F1 NIH_MGC_16 Homo sapiens 253
437010 AA741368 Hs591434 ESTs 253
435111 AI914279 Hs513740 ESTs 252
403375 25]
455060 AW853441 "gb:RC1-CT0252-030100-023-g09 CT0252 Hom 251
409792 AW854153 "gbflC^CT0254-06040(H)29-d03 CT0254 Horn 250
421154 AA284333 Hs587631 "Homo sapiens cDNA FLJ14269 fis, clone P 2.19
401963 2.18
435034 AF168711 Hs.159397 x 010 protein 2.18
448996 AW998989 Hs.105749 KIAA0553 protein 2.18
436816 AW297599 Hs555667 ESTs 2.17
442252 AI733395 Hs.129124 ESTs 2.17
419310 AA236233 Hs.188716 ESTs 2.16
418579 H91800 Hs.124156 ESTs 2.16
423315 R54109 Hs56096 ESTs 2.16
432744 AA988835 Hs58664 ESTs 2.15
424492 AI133482 Hs.165210 ESTs 2.15
424770 AA425562 "gb:zw46e05.r1 Soares_total_fetus_Nb2HF8 2.15
437101 AA744518 Hs.120610 ESTs 2.15
428793 AC004957 Hs598975 "ESTs, Highly similar to collapsin-2-lik 2.15
415708 H56475 "gb:yt87d11.r1 Soares_pineal_gland_N3HPG 2.13
459619 ?/H
427506 AK000134 Hs.179100 hypothetical protein FLJ20127 2.12
452508 AA804174 Hs.184354 ESTs , *■]<>
410881 AW809157 "$:RC0-ST011&O41093-C31-c07J ST0118 Homo sapiens cDNA, mRNA sequence" 2.10
403087 * «32
403869 z,1 °
445028 D81194 Hs.2824S9 ESTs „ , ™
447884 H29505 "gb:ym60d10.r1 Scares Infant brahlNIB Horn) sapiens eD^ckme?,mRNA sequence 2.10
414575 H11257 Hs.295233 ESTs 2.09
420351 BE218221 Hs.190044 ESTs 2.08
426998 BE274360 "gb:601121068F1 NIH_MGC_20 Homo sapiens cDNA clone 5', mRNA sequence 2.08
405455 2.08
423843 AA332652 "gb£ST36627EniHyo.8v^klHomosaplarBcONA5'endsimtotD^narto monoamine oxidase B, mRNA sequence" 2.08
406135 2.07
427046 BE246180 Hs.121385 ESTs 2.07
403493 " H
444514 AI682905 Hs270431 "ESTs, Weakly similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION WARNING ENTRY [H.sapiens]" 2.05
435884 AA701443 Hs.192868 ESTs ^
419629 AB020695 HsX1662 KIAA0888 protein 2.03
405900
457350 AW974438 Hs.194136 "ESTs, Moderately similar to AF091457_1 zinc finger protein RIN ZF [R.norvegicus]" 202
400007 AFFX control: BioDn-5 2.01
406978 M64358 "gb:Human rhom-3 gene, exon." 2.00
TABLE 8A shows the accession numbers for those primekeys lacking a unigeneID in Table
8. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers



Pkey CAT number Accessions
407764 1014849_1 BE008347 BE008320 BE083307 BE033311 AWD7SS88
408537 1064753_1 AW207734 D60164 D81150 D81078 061356 AW996804
409792 1154677_1 AW854153 AW500210 BE145772 AW501310
410881 1225682_1 AW809157 AW812181 AW812175 AW812172 AW812161 AW812165
411762 1256906_1 AW860972 AW862593 AW862599 AW850988 AW860983 AW860838 AW860925 AW860922 AW860986 AW860984 AW860989
413212 1353792_1 BE072092 BE072106 BE072086 BE072098 BE072103
413549 1375933_1 BE252470 BE147573
415708 1548209_1 H56475 R9401 F34552
415835 1558511_1 Z45355 R25905 H05203 T77496
422046 210744_1 AI638562 T16929 H13401 F07773 R55836
423151 225415_1 AW838068 AW837986 AW838067 AA322487 AW837936
423843 232510_1 AA332652 AA331633 AW999369 AYV902993 BE170475 AA378845 AW964175 AM75221
424770 243504_1 AA425562 AIB8Q208 AA346646 N22655 AW811775 AW811786
426998 274259 -1 BE274360
432456 347718_1 H00093 H00079 H00070 H00054 H00049 H00063 AW905306 AW905241 AW905410 AW905307 AW905411 AW905240 AW905210 AW905352 AW905304 AW905239 AW905242 AW905243 H00087
438224 452656_1 AA933999 AA781181
447884 740749_1 H29505 R18575 Z43580 T48738 AI435454 BE004683
451237 863269_1 AW600293 AI76746B
455029 1249374_1 AW851258 AW851435 AW851106 AW851421
455060 1251259_1 AW853441 BE145228 BE145218 BE145162 BE145283
455590 1335127_1 BE072259 BE072230 BE007911
458311 543550_1 AF069478 AF069479 AF059480
TABLE 8B shows the genomic positioning for those primekeys lacking UniGene IDs and
accession numbers in Table 8. For each predicted exon, we have listed the genomic sequence
source used for prediction. Nucleotide locations of each predicted exon are also listed.



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.


Pkey Ref Strand Nt_position
401963 3126783 Plus 51382-51521
401974 3126777 Plus 8533045683
403087 8954241 Plus 169511-169795
403375 9255944 Minus 9255442795
403493 7341425 Plus 157568-159084
403612 8469060 Minus 9472344859
403649 8705159 Minus 27141-27247
403869 7280046 Minus 3437944583
404113 9588571 Minus 13446-13646
404567 7249169 Minus 101320-101501
405163 9966267 Minus 161171-161299
405227 6731245 Minus 22550-22802
405420 7211837 Minus 13428-13582
405455 7656675 Plus 134112-134671
405678 4079870 Plus 151821-152027
405900 6758795 Minus 71181-71535
406135 9164918 Minus 6548945715
TABLE 9: 1001 GENES SIGNIFICANTLY UP-REGULATED IN NORMAL PROSTATE
COMPARED TO PROSTATE CANCER

Table 9 shows 1001 genes significantly up-regulated in prostate cancer compared to normal
prostate. These were selected from 59680 probesets on the Affymetrix/Eos Hu03 GeneChip
array such that the ratio of "average" normal prostate to "average" prostate cancer tissues was
greater than or equal to 8.14. The "average" normal prostate level was set to the mean
amongst 4 normal prostate tissues. The "average" prostate cancer level was set to the 85th
percentile amongst 73 tumor samples. In order to remove gene-specific background levels of
non-specific hybridization, the 10th percentile value amongst all the tissues was subtracted
from both the numerator and the denominator before the ratio was evaluated.
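
The selection rule described above can be illustrated with a minimal sketch. It follows the ratio exactly as worded in the preceding paragraph (background-subtracted "average" normal level over background-subtracted "average" tumor level). The function names, the linear-interpolation percentile rule, and the handling of a non-positive denominator are assumptions made for illustration only; the text does not state how these details were handled.

```python
from statistics import mean


def percentile(values, q):
    """Return the q-th percentile (0-100) of values.

    Linear interpolation between order statistics is an assumption;
    the source text does not specify a percentile convention.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def table9_ratio(normal_levels, tumor_levels, all_tissue_levels):
    """Background-subtracted ratio of "average" normal to "average" tumor level."""
    # Gene-specific background: 10th percentile amongst all tissues.
    background = percentile(all_tissue_levels, 10)
    # "Average" normal level: mean amongst the normal prostate samples.
    normal_avg = mean(normal_levels)
    # "Average" tumor level: 85th percentile amongst the tumor samples.
    tumor_avg = percentile(tumor_levels, 85)
    denominator = tumor_avg - background
    if denominator <= 0:
        # Hypothetical choice: treat the ratio as undefined in this case.
        return None
    return (normal_avg - background) / denominator


def passes_cutoff(ratio, cutoff=8.14):
    """Apply the stated selection threshold (ratio >= 8.14)."""
    return ratio is not None and ratio >= cutoff


if __name__ == "__main__":
    # Made-up expression levels for a single probeset, not data from the tables.
    normal = [100, 120, 110, 130]
    tumor = [10, 12, 14, 16, 18, 20]
    ratio = table9_ratio(normal, tumor, normal + tumor)
    print(ratio, passes_cutoff(ratio))
```

Subtracting the same background value from numerator and denominator means that a probeset whose signal sits near its background level in the denominator tissues produces a very large or undefined ratio, which is why the sketch returns `None` rather than dividing by zero.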



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Ratio of prostate cancer to normal prostate



Pkey ExAccn UnigeneID Unigene Title R1
451002 AA013299 Hs.8018 ESTs, Weakly similar to ALU3_HUMAN ALU S 1684.00
435596 AA689465 Hs.188999 ESTs 738.00
443576 AI078027 Hs.169333 ESTs 24656
434247 AA928116 Hs272065 ESTs 24520
400452 AK000185 gb:Homo sapiens cDNA FLJ20178 fis, clone 222.00
405932 22153
427906 AAS64330 Hs.166520 ESTs zvua
443685 AI686550 Hs.174481 ESTs 16320
451554 AI474866 Hs.193237 ESTs 149.45
418323 NM_002118 Hs.1162 major histocompatibility complex, class 126.11
429480 M36860 Hs.9295 elastin (supravalvular aortic stenosis, 12327
426025 AW138330 HS233778 ESTs 120.00
418917 X02994 Hs.1217 adenosine deaminase 106.75
404407 105.71
442027 AI652926 Hs.128395 ESTs 10053
433704 AA60B684 Hs.121705 ESTs, Moderately similar to ALUC_HUMAN ! 9450
453758 U83527 gb:HSU83527 Human fetal brain (M.Lovett) 89.18
415354 F06495 gb:HSC1AB051 normalized infant brain cDN 87.73
424239 M67439 Hs.143526 dopamine receptor D5 8652
444143 AW747996 Hs.160999 ESTs 86.43
401672 7726
430590 AW383947 Hs246381 CD68 antigen 6847
411972 BE074959 gb:PM0-BT0582-310100-001-f08 BT0582 Homo 68.00
448992 AI766053 Hs.188346 ESTs 6128
408828 BE540279 gb:601059857F1 NIH_MGC_10 Homo sapiens c 57.71
409653 AW451693 Hs520826 ESTs 56.40
402964 5457
422673 N59027 gb:yv59d1 1 A Soares fetal liver spleen 54.00
422568 AA372275 Hs579800 Homo sapiens cDNA FLJ11383 fis, clone HE 5450
438907 R32704 Hs.301298 ESTs 5256
405172 5258
444897 AW137088 Hs.144857 ESTs 5252
458019 AW592931 Hs.256298 ESTs 51.63
405275 AB028989 Hs.88500 mitogen-activated protein kinase 8 inter 5058
457815 AA703679 Hs.106999 ESTs, Weakly similar to SYT5_HUMAN SYNAP 4950
424385 AA339666 gb:EST44776 Fetal brain I Homo sapiens c 4850
407172 T54095 gb:ya92c05.s1 Stratagene placenta (93722 4758
428202 AA424163 Hs.156895 ESTs 4653
435672 AI700148 KS283626 ESTs 4357
420283 AA485224 Hs57734 G protein-coupled receptor kinase-intera 43.00
417016 AA837098 HS269933 ESTs 42.70
438854 AF074S94 Ks24240 ESTs 42.67






406134 4143
457319 AA480835 Hs.201552 ESTs, Weakly similar to T17288 hypotheti 42X1
409314 AA070268 gb:zm69d04 x\ Stratagene neuroepithelium 4225
401124 41X1
429316 AB71157 Hs.178538 ESTs 40X0
420317 AB006628 Hs.96485 KIAA0290 protein 39X4
457588 AWDS2439 gb«RfrCT0060-120899-001-f08 CT0060 Homo 39.80
417407 AA923278 Hs290905 ESTs, Weakly similar to protease [H.sapi 38.73
430269 BE221682 Hs.178364 ESTs 38X8
439602 W79114 Hs.58558 ESTs 36X9
433686 AA6M799 Hs.136528 ESTs, Moderately similar to ALU1_HUMAN A 3629
417893 AW963705 Hs.295806 ESTs, Weakly similar to ALU7_HUMAN ALU S 36.18
428214 AA936282 Hs.120397 ESTs 36.10
416908 AA333990 HsX0424 coagulation factor XIII, A1 polypeptide 36.08
426264 BE314852 Hs.168694 hypothetical protein FLJ10257 36X0
415911 H0B796 Hs.124952 ESTs 38X0
457502 AA076049 Hs274415 Homo sapiens cDNA FLJ10229 fis, clone HE 3523
421566 NM_000399 Hs.1395 early growth response 2 (Krox-20 (Drosop 3520
401468 34X9
458561 AI220150 Ks211195 ESTs 34X0
433601 BE350738 Hs.123993 ESTs, Weakly similar to T00365 hypotheti 3324
454977 AW848032 gb-JL3OT0214-231299O534)11 CT0214Hon» 32X6
402828 32X3
414522 AW518944 Hs.76325 Homo sapiens cDNA: FLJ23125 fis, clone L 31.76
402842 31X8
421245 AA285363 gb:HTH28Q HTCDL1 Homo sapiens cDNA 513 31X9
401631 F05I83 Hs.1799 CD1D antigen, d polypeptide 3126
408057 AW139565 gb:UI-H-BI1-aea-<M4*Ul.s1 NCI_CGAP_Su 3124
408069 H81795 gb:ys68a1 0/1 Soares retina N2b4HR Homo 3120
438694 T87479 Hs291797 ESTs 31X9
449156 AF103907 Hs.171353 prostate cancer antigen 3 2978
428796 AU076734 Hs.193665 solute carrier family 28 (sodium-coupled 2976
452549 AI907039 gb:PM-BT134-020499-566 BT134 Homo sapien 29X9
410129 BE244074 HS285531 regulator of Fas-induced apoptosis 29X3
414464 AI870175 Hs.13957 ESTs 29.47
412326 R07566 Hs.73817 small inducible cytokine A3 (homologous 2922
459081 W07808 gb:zb03a12j1 Soares_fetal_lung_NbHL19W 2920
448702 AW102670 Hs.122464 ESTs 29.13
451939 U80456 Hs27311 single-minded (Drosophila) homolog 2 28.74
443412 W84893 Hs.9305 angiotensin receptor-like 1 28X1
457324 AB028990 Hs243901 KIAA1067 protein 2824
424247 X14008 Hs234734 lysozyme (renal amyloidosis) 28.18
457140 AI279960 Hs.178140 ESTs 28.12
444151 AW972917 Hs.128749 alpha-methylacyl-CoA racemase 28X6
457669 AW104257 Hs.123426 ESTs, Weakly similar to putative serine/ 27X1
412429 AV650262 Hs.75765 GRO2 oncogene 27X6
405495 27^3
406516 2725
407997 AW135429 Hs243577 ESTs 26X6
442115 AW452332 HS257S54 ESTs 26X6
409038 T97490 HsX0002 small inducible cytokine subfamily A (Cy 26X4
402838 2632
449846 AI979284 HS200552 ESTs 2621
417153 X57010 HsX1343 collagen, type II, alpha 1 (primary oste 2620
439792 NM_014856 Hs.6684 KIAA0476 gene product 25X1
450098 AI682088 Hs223368 ESTs 25X0
424196 AL133660 Hs.142928 Homo sapiens mRNA; cDNA DKFZp434M0927 (f 25X7
414246 BE391090 Hs280278 EST 25X7
420848 NM_005188 HsX9980 Cas-Br-M (murine) ecotropic retroviral t 2548
424778 AA251048 Hs.153042 lymphocyte antigen 9 25.42
409126 AA063426 gb:zf70c08.s1 Soares_pineal_gland_N3HPG 2525
443936 AW083491 Hs.31196 ESTs 2522
419392 W28573 gb:SlflO Human retina cDNA randomly prim 25X1
411201 T74588 HsX509 ESTs, Weakly similar to CO3_HUMAN COMPLE 24X5
422940 BE077458 gb:RC1-BT060fr09050CH)15^)04 BT0606 Homo 2476
437571 AA760894 Hs.153023 ESTs 2474
433973 AI014723 Hs.131770 ESTs 24X7
422416 BE019557 Hs.11900 Human DNA sequence from clone RP4-583P15 24X3
421552 AF026692 Hs.105700 secreted frizzled-related protein 4 24.49
443668
424800 
453633 
430565 
433694 
451045 
408583 
444040 
414182 
418678 
408380 
456076 
418299 
444917 
444381 
415788 
410396 
412978 
458418 
454791 
408748 
416011 
440474 
447047 
426793 
409841 
405685 
457359 
423057 
422355 
401201 
458278 
439097 
414875 
400926
U25758
AL035588 
AAS57001 
AL122081 
A1208611 
AA215672 
AW449674 
AF204231 
AA136301 
NM.001327 
AF1230S0 
BE243877 
AA279530 
R68551 
BE387335 
AW628686 
AW809637 
AM3170B 
AV653846 
BBJ71874 
J05500 
H14487 
AI207936 



451355 
446982 
417105 
405777 
424123 
425009 
443271 
421064 
418819 
457595 
404426 
412571 
431457 
414002 
418994 
437158 
437866 
417421 
433057 
421730 
456557 
440306 
439345 
416155 
437820 
450923 
418329 
424537 
447742 
415251 
440770 
407711 
427157 
409847 



Hs.134584 

Hs.153203 

H&34045 

Hs244343 

Hs.12066 

Hs.47359 
Hs-182982 

Hs.167379 

Hs.44532 

Hs.76941 

Hs33968 

Hs.144997 

HS283713 

Hs.78851 

H&820 
Hs.126261 

Hs.47431 

Hs.7195 

Hs246306 

Hs.172350 



ESTs

MyoD family inhibitor
hypothetical protein FLJ20764
cadherin related 23

Homo sapiens cDNA FLJ11720 fis, clone HE
gb:ar96e09.s1 NCI_CGAP_GCB1 Homo sapiens
ESTs



gb:zk93g04.s1 Soares_pregnant_uterus_NbH



AW502139 

AI983207 
AA321355 
AW403724 

W28912 
H66948 
H42679 

NM.004197 

AW500221 

X60992 

AW966158 

X58288 

BE568568 

AI245432 

AA228776 

AA584854 

U43143 

NMJ012211 

NMJ006732 

AA296520 

AW090198 

AA156781 

AL138201 

X15675 

AW449808 

AA284477 

AK47422 

AL355743 

AI807264 

AA769062 

AW043951 

AW247430 

AI673027 

API 13925 

R42863 

AA912B15 

AI085846 

U51166 

AW501751 



dtublqu2k\ 

ATPase, Na+/K+ transporting, beta 3 poly
integrin, beta 2 (antigen CD18 (p95), ly
ESTs
ESTs

KIAA0217 protein

gb:MR4-ST0124-281099-015-b07 ST0124 Homo
homeo box C8

Homo sapiens Chromosome 16 BAC clone CTT
gb:RC2-BT0522-120200-014-a06 BT0522 Homo
spectrin, beta, erythrocytic (includes s
gb:ym18c10j1 Soares infant brain 1NIB H
gamma-aminobutyric acid (GABA) A recepto
Homo sapiens cDNA: FLJ23529 fis, clone L
HIR (histone cell cycle regulation defec
gb^HF*R0p#-«<)5MHJlJl NIH_MGC_5 



Hs.192481 ESTs, Weakly similar to SYPH_HUMAN SYNAP
Hs.285401 ESTs

Hs.140 immunoglobulin heavy constant gamma 3 (G

Hs.129019 ESTs

gkyrffidlOjl Soares fetal liver spleen
Hs.77522 major histocompatibility complex, class

Hs.444 serine/threonine kinase 19

Hs.43616 Homo sapiens mRNA for FLJ00029 protein,

Hs31226 CD6 antigen

Hs.58582 Homo sapiens cDNA FLJ12702 fis, clone NT
Hs.154151 protein tyrosine phosphatase, receptor t
Hs.195704 ESTs

Hs.101382 tumor necrosis factor, alpha-induced pro
Hs.191721 ESTs

gb:mo09h1U1 NCI_CGAP_Phe1 Homo sapiens



Hs.74049 

Hs256297 

Hs.75678 

Hs39546 

Hs.4779 

Hs.83992 

Hs.82120 

H&296832 

Hs.164036 

HSJ96618 

Hs.129966 

Hs36663 

Hs205442 

Hs.16029 

H&38449 

Hs34152 

Hs.143271 

Hs.19405 

Hs.7124 

Hs222078 

Hs25522 

Hs.173824 

Hs279733 



integrin, alpha 11 
FBJ murine osteosarcoma viral oncogene h 
selectin E (endothelial adhesion molecul
KIAA1150 protein
ESTs

nuclear receptor subfamily 4, group A, m
Human pTR7 mRNA for repetitive sequence
glucosamine (N-acetyl)-6-sulfatase (Sanf
ESTs
ESTs

Homo sapiens EST from clone 41214, full
ESTs, Weakly similar to AF117610 1 inner
ESTs, Weakly similar to alternatively sp
ESTs

cystathionine-beta-synthase
ESTs

caspase recruitment domain 4

ESTs

ESTs

ESTs

thymine-DNA glycosylase
ESTs 



2449 
24.10 
24J04 
24.00 
2339 
2333 
23.73 
2332 
2339 
2020 
22J88 
22j65 
2238 
2226 
22J08 
2234 
22jOO 
2155 
2154 
21.84 
2126 
2124 
21.14 
21.11 
21.10 
21j07 
2050 
2034 
20.74 
20.73 
20.73 
2038 
2037 
20.66 
2036 
2034 
2031 
2031 
2031 
2020 
20.10 
1938 
1938 
1934 
1930 
1934 
19.79 
1932 
1937 
1936 
1932 
19.44 
■ 1934 
1922 
1921 
1177 
18.76 
1835 
1834 
1832 
1839 
1838 
1835 
1832 
1847 
1840 
1832 
1828 
18.15 
417240 N57558
435732 AF229178 
436896 AW977385 
432485 N90836 
428430 AI971131 
429984 AU050102 
449214 AI889114 
433857 AK000596 
431735 AW977724 
401515 

444045 AI097439 
442754 AUM5825 
426559 AB001914 
432415 T16971 
427829 AI188225 
432516 R08003 
435259 AA152106 
414989 T81668 
444880 AW118683 
417651 R06674 
453457 AI537103 
424246 AW452533 
419078 M93119 
417696 BE241824 
431117 AF0B3522 
455254 AW877015 
425782 U66468 
426678 H08170 
426403 NM_000361 
425905 AB032959 
438867 AW451157 
420940 AA830664 
459234 At940425 
404756 

422247 U18244 
420568 F09247 
443559 AI076765 
438703 AI803373 
411424 AW845985 



NM.006441 
AW449802 
AB002367 
AW451955 
AW190902 
R23534 
AB018319 
AA047854 
A1080042 
AA534908 
AA847856 
AW1 35221 
AW796342 
AUJ49810 
NM003816 
AI357412 
BE281591 
AA055800 

446012 
409671 
405934 
426108 
416208 
410708 
447342 
454563 
411507 
438170 
416292 



422538 
447108 
448520 
438567 
407811 
410721 
437133 
408182 
417315 
431840 
439382 
418277 
410S88 
420120 
429597 
447033 
421684 



AA076769 

AAS22037 

AW291168 

AA534370 

AI199268 

AW807530 

AW850140 

AJ916685 

AA179233 



Hs.176028 EST 

Hs.123138 leucine rich repeat and death domain con

Hs.278615 ESTs

Hs576770 CDW52 antigen (CAMPATH-1 antigen)

Hs.293684 ESTs, Weakly similar to alternatively sp

Hs.227209 DKFZP586F1019 protein

Hs.195663 ESTs

Hs5618 hippocalcin-like 1

Hs.75968 thymosin, beta 4, X chromosome

Hs.135548 ESTs
Hs510197 ESTs

Hs.170414 paired basic amino acid cleaving system
Hs.289014 ESTs
Hs.127462 ESTs
Hs.188013 ESTs
Hs.4859 cyclin L ania-6a

gb^d29o04/1 Soares fetal liver spleen
Hs.154150 ESTs
Hs.268628 ESTs

Hs.270599 ESTs, Weakly similar to unnamed protein

Hs.143604 Kaiso

HsjB9584 insulinoma-associated 1

Hs52401 CD69 antigen (p60, early T-cell activati

Hs550500 delta (Drosophila)-like 1

gbXJV2-PT0010-25O300O96fl2 PT0010 Homo
Hs.159525 cell growth regulatory with EF-hand doma
Hs.113755 ESTs
Hs.2030 thrombomodulin
Hs.161700 KIAA1133 protein
Hs.181157 ESTs
Hs.143974 ESTs

gb£M0<rT0052-150799O24-c04 CT0052 Homo

Hs.113602 solute carrier family 1 (high affinity a
Hs.167399 protocadherin alpha 5
Hs.269899 ESTs
Hs51599 ESTs

gb:RC2-CT0163-200999-002-HOB CT0163 Homo

Hs.118131 5,10-methenyltetrahydrofolate synthetase
Hs517953 ESTs, Moderately similar to NK-TUMOR REC
Hs.21355 doublecortin and CaM kinase-like 1
Hs.153065 ESTs

Hs.40098 cysteine knot superfamily 1, BMP antagon
Hs5730 heterogeneous nuclear ribonucleoprotein
Hs5460 KIAA0776 protein

gb3f49g04 jl Soares retina N2b4HR Homo
Hs.180450 ribosomal protein S24
Hs.2860 POU domain, class 5, transcription facto
Hs.124565 ESTs
Hs.130812 ESTs

gb:PM2-UM0027-230200-002-h02 UM0027 Homo
Hs.95243 transcription elongation factor A (SII),
Hs5442 a disintegrin and metalloproteinase doma
Hs.157601 EST - not in UniGene
Hs.106768 hypothetical protein FLJ10511
Hs.222933 ESTs

Hs.172382 hypothetical protein FLJ20001

gb:7B02B10 Chromosome 7 Fetal Brain cDNA

Hs.166468 programmed cell death 5
Hs.41295 ESTs

Hs.154088 Homo sapiens cDNA: FLJ22756 fis, clone K
Hs.19322 ESTs; Weakly similar to !!!! ALU SUBFAMI

gb:CM0-ST0081-130999-054-d02 ST0081 Homo
gb:IL3-CT0219-261099-023011 CT0219 Homo
Hs.194601 ESTs 

Hs.42390 nasopharyngeal carcinoma susceptibility 



18.13 

18.12 

18.12 

1750 

17-82 

17.82 

17.75 

17.72 

17.71 

17jB7 

17J58 

1755 

1754 

17.50 

1750 

17-44 

17-36 

17-31 

1750 

1757 

1752 

1752 

17.18 

17.14 

17.14 

17.14 

17.12 

17.12 

1751 

1750 

1658 

1654 

1652 

1S51 

1650 

1658 

1650 

16.78 

16.70 

1659 

1658 

16.65 

1654 

1652 

1650 

1650 

16.40 

1652 

1650 

1658 

1650 

1659 

16.04 

16.04 

16.02 

16.02 

15.94 

1553 

1556 

1555 

1554 

1554 

15.48 

15.42 

1558 

1557 

1556 

1529 

1556 
406638 M13861 gb:Human T-cell receptor active beta-cha 1526
446686 AW13B043 Hs.156307 ESTs 1525
434485 AI523511 Hs.118567 ESTs 1524
441188 AW292830 Hs.255609 ESTs 1522
444172 BE147740 Hs.104558 ESTs 1522
409521 BE244854 Hs.159578 Homo sapiens mRNA for FLJ00020 protein, 15.16
420748 AA27S956 Hs.88672 ESTs 15.14
422583 AA410506 Hs.118578 H.sapiens mRNA for ribosomal protein L18 15.14
424240 AB023185 Hs.143535 calcium/calmodulin-dependent protein kin 15.12
451118 AB62098 Hs.60640 ESTs 15.12
437495 BE177778 gb:RC1-HT0598-310300-012-{(J7 HT0598 Homo 15.12
445467 AK39832 Hs.15617 ESTs, Weakly similar to ALU4_HUMAN ALU S 15X6
418305 AW006783 Hs.6686 ESTs 15X3
402812 15X2
436851 AA732480 Hs293581 ESTs 15X0
400991 15X0
415752 BE314524 Hs.78778 Human putative transmembrane protein (nm 1456
429900 AA460421 HS50875 ESTs 1450
403683 14.84
430315 NM_004293 Ks239147 guanine deaminase 1450
451952 AL120173 Hs.301663 ESTs 14.72
424687 J05070 Hs.151738 matrix metalloproteinase 9 (gelatinase B 1459
447229 BE617135 gb:601441677F1 NIH_MGC_65 Homo sapiens c 14X7
425818 AB021225 Hs.159581 matrix metalloproteinase 17 (membrane-in 14X5
448553 AK38449 Hs.173031 ESTs 14X3
431089 BE041395 Hs283676 ESTs, Weakly similar to unknown protein 14X0
459145 AB03354 gb:RC-BT029-100199-117 BT029 Homo sapien 1455
449650 AF055575 KS297647 ESTs, Moderately similar to calcium chan 1454
400952 14.46
445885 AI734009 Hs.127699 EST cluster (not in UniGene) 14.44
407938 AA905097 Hs55050 phospholamban 14.42
431676 AI685464 Hs.292638 ESTs 14.40
437210 AA311443 Hs293563 Homo sapiens mRNA; cDNA DKFZp586E2317 (f 1456
451900 AB023199 Hs27207 KIAA0982 protein 1456
445800 AA126419 Hs501632 ESTs 1452
412368 AW945992 Hs.181125 immunoglobulin lambda locus 1451
409055 AW304028 Hs500578 ESTs 1423
408763 W57550 Hs501526 Homo sapiens cDNA FLJ13181 fis, clone MT 1422
446734 AUM9278 Hs.16074 Homo sapiens mRNA; cDNA DKFZp554l153 (fr 1422
413551 BE242639 Hs.75425 ubiquitin associated protein 1422
421913 AI934365 Hs.109439 osteoglycin (osteoinductive factor, mime 1422
452712 AW838616 gb:RC5^T0054-14020(KI13^M1 LT0054 Homo 1422
451466 AW503398 Hs210047 ESTs 14.16
406038 Y14443 Hs58219 zinc finger protein 200 14.14
424909 S78187 Hs.153752 cell division cycle 25B 14X7
434078 AW880709 HS283683 EST 14X7
415254 AI815831 Hs.184378 ESTs 14.05
418196 AI745849 Hs26549 ESTs, Weakly similar to T00066 hypotheti 14.02
410020 T86315 Hs.728 ribonuclease, RNase A family, 2 (liver, 1358
411352 NM_002890 Hs.758 RAS p21 protein activator (GTPase activa 1358
429848 AF145439 Hs225946 chemokine (C-C motif) receptor 9 1355
413729 BE159999 gb^!4!T(W12-270300-l23^10HT0412Homo 13.90
400125 1358
420319 AW406289 Hs56593 hypothetical protein 1355
448272 AM79094 Hs.170786 ESTs 1350
422695 AA315158 gb:EST186956 HCC cell line (metastasis t 1350
424565 AW102723 Hs.75295 guanylate cyclase 1, soluble, alpha 3 13.78
458048 H30340 Hs.173705 Homo sapiens cDNA: FLJ22050 fis, clone H 13.78
408894 AI935400 Hs217286 ESTs 13.76
454093 AW860158 gb:RC0-CT0379-29010(W32-b04 CT0379 Homo 13.75
410889 X91662 HsX6744 twist (Drosophila) homolog (acrocephalos 13.74
457751 AI908238 gb:IL-BT166-180399-010 BT166 Homo sapien 13.72
455131 AW857913 gb:RC0-CT0323-231199-031-b05 CT0323 Homo 1359
408354 AW015238 Hs.128453 ESTs 1357
425907 AA365752 Hs.155955 ESTs 1352
402359 13.60
401044 1353
409877 AW502498 Hs.157150 ESTs, Weakly similar to zinc finger prot 1353
423690 AA329648 Hs23804 ESTs 13.49
430685 A169Q234 Hs.191666 ESTs, Weakly similar to reverse transcri 1347
414052 AW578849 Hs283552 ESTs, Weakly similar to unnamed protein 1346
447858 AW080339 Hs.211911 ESTs 1344
435716 AI573283 Hs.3845B ESTs 1344
439120 H56389 gb:yt87c03.r1 Soares_pineal_gland_N3HPG 1343
402788 1340
451591 AA886446 Hs.146278 ESTs 1340
405411 13.38
426558 AW188574 Hs24218 ESTs 1354
453506 M132818 Hs.110407 ESTs, Weakly similar to coded for by C. 1353
416445 AL043004 Hs500878 Human serine/threonine kinase mRNA, part 1352
457084 AI074149 Hs.150905 ESTs, Weakly similar to chondroitin 4-su 1332
403838 1352
427337 Z46223 Hs.176663 Fc fragment of IgG, low affinity IIIb, r 1350
434318 AW207552 Hs.116328 ESTs, Weakly similar to dJ134E15.1 [H.sa 1328
435193 N41359 Hs.218107 ESTs 1328
414756 AW451101 Hs.159489 ESTs, Moderately similar to hexokinase I 1327
420626 AP043722 Hs59491 RAS guanyl releasing protein 2 (calcium 1326
420052 AA418850 Hs44410 ESTs 1325
414020 NM_002984 Hs.75703 small inducible cytokine A4 (homologous 1325
403851 1324
422647 W07492 Hs.157101 ESTs 1321
433598 AI762836 Hs271433 ESTs, Moderately similar to ALU2_HUMAN A 1321
409065 AB033113 Hs50187 KIAA1287 protein 1320
435063 R21S66 Hs57734 G protein-coupled receptor kinase-intera 13.19
439367 BE386844 Hs248746 ESTs 13.17
451357 AI796320 Hs.10299 Homo sapiens cDNA FLJ13545 fis, clone PL 13.16
420569 AA278362 Hs28S062 Homo sapiens cDNA FLJ12334 fis, clone MA 13.14
447883 BE262802 Hs4909 dickkopf (Xenopus laevis) homolog 3 13.07
426490 NM_001621 Hs.170087 aryl hydrocarbon receptor 1306
414789 AA155859 Hs.79708 ESTs 13.05
451416 BE387790 Hs26369 ESTs 13.04
443494 T99719 Hs270404 Homo sapiens cDNA: FLJ22389 fis, clone H 13X13
425878 AW984806 Hs58085 ESTs, Weakly similar to putative glycine 13X12
431912 AI660552 Hs.154903 ESTs, Weakly similar to A56154 Act subst 13XJ0
407122 H20276 Hs51742 ESTs 13.00
456491 AL137466 Hsj97277 Homo sapiens mRNA; cDNA DKFZp434H1322 (f 12X49
448172 M75276 Hs.135904 ESTs 1258
452144 AA032197 Hs.102558 ESTs 12X16
419953 BE267154 Hs.125752 ESTs 1256
416182 NM_004354 Hs.79069 cyclin G2 12X44
451154 AA015879 Hs53536 ESTs 1253
412257 AW903830 p^CM44tN1O37-25(MO0-155-h04 NN1037 Homo 1253
449784 AW161319 Hs.12915 ESTs 1252
432695 D63480 Hs278634 KIAA014S protein 1252
454105 NM_001259 Hs58481 cyclin-dependent kinase 6 12X42
439093 AA534163 Hs.5476 serine protease inhibitor, Kazal type, 5 1250
416098 H41324 Hs51581 ESTs, Moderately similar to ST1B_HUMAN S 1258
424897 D63216 Hs.153684 frizzled-related protein 1258
414604 AU076649 Hs.76556 growth arrest and DNA-damage-inducible 3 1258
414664 AA587775 Hs.66295 Homo sapiens HSPC311 mRNA, partial cds 1254
452560 BE077084 gb:RC5-BT0603-220200-013-C07 BT0603 Homo 1254
413869 NM_000878 Hs.75598 interleukin 2 receptor, beta 1250
452359 BE167229 Hs29206 Homo sapiens clone 24659 mRNA sequence 1250
435386 BE265839 Hs.12126 hepatocellular carcinoma-associated anti 12.78
445230 U97018 Hs.12451 echinoderm microtubule-associated protei 12.78
412226 W26786 gb:15d7 Human retina cDNA randomly prime 12.77
446619 AU076643 Hs513 secreted phosphoprotein 1 (osteopontin, 12.76
447769 AW873704 Hs.48764 ESTs 12.76
414478 AI306389 Hs.76240 adenylate kinase 1 12.76
425383 D83407 Hs.156007 Down syndrome critical region gene 1-lik 12.68
450704 H85157 Hs.40696 ESTs 1256
405856 12.66
412935 BE267045 Hs.75064 tubulin-specific chaperone c 1255
402802 12.62
452588 AA889120 Hs.110637 homeo box A10 1252
415978 NM_001454 HS53974 forkhead box J1 12.62
403137 1250
430226 BE245562 Hs2551 adrenergic, beta-2-, receptor, surface 1257
448076 AJ133123 Hs20198 adenylate cyclase 9 1256
450462 F07097 Hs500828 Homo sapiens mRNA full length insert cDN 1254
405236 1252
409222 AA071051 gb:zm58e05.s1 Stratagene fibroblast (937 1247
421540 AA767669 Hs.10242 ESTs 1247
425840 AW978731 Hs501824 ESTs 12.44
443181 AI039201 Hs54548 ESTs 12.42
452436 BE077546 Hs51447 ESTs 12.42
455183 AW984111 flbfl(X>WI0007-160300O11-f09 HN0007 Homo 1240
432887 AB26047 Hs.162859 ESTs 1237
410494 M36564 Hs.64016 protein S (alpha) 1256
439024 R96696 Hs.35598 ESTs 1256
451246 AW189232 Hs59140 cutaneous T-cell lymphoma tumor antigen 1256
432892 AL042615 Hs.15995 ESTs 1255
418982 AB48838 Hs.13073 ESTs 1255
414516 AB07802 Hs278S51 ESTs 1254
440134 BE410734 gb:601301619F1 NIH_MGC_21 Homo sapiens c 1229
443873 AL048S42 Hs.16291 ESTs 1228
401286 1226
454020 AW962845 HS256527 ESTs 1224
420077 AW512260 Hs57767 ESTs 1224
443837 AI984625 Hs5884 spindle pole body protein 1224
407519 X64979 gb:H.sapiens mRNA HTPCRX01 for olfactory 1223
435339 AF249744 Hs25951 Rho guanine nucleotide exchange factor ( 1222
448552 AW973653 Hs20104 hypothetical protein FLJ00052 1220
405325 1220
451009 AA013140 Hs.115707 ESTs 12.18
423066 Y18264 Hs.120171 ESTs 12.17
439556 AI623752 Hs.163603 ESTs 12.16
443062 N77999 Hs5963 Homo sapiens mRNA full length insert cDN 12.15
445873 AA250970 Hs251946 Homo sapiens cDNA: FLJ23107 fis, clone L 12.14
453542 AW836724 Hs.33190 Homo sapiens mRNA expressed only in plac 12.11
440103 AA864968 Hs.127699 ESTs 12.10
417605 AF006609 Hs52294 regulator of G-protein signalling 3 12.10
440286 U29589 Hs.7138 cholinergic receptor, muscarinic 3 12.04
420061 AW024937 Hs29410 ESTs 12X12
458727 AI022813 Hs.92679 Homo sapiens clone COABP0014 mRNA sequen 1156
445407 AI222658 Hs221889 ESTs, Weakly similar to la costa [D.mela 1155
418250 U29926 Hs.83918 adenosine monophosphate deaminase (isofo 1154
414129 AI990287 Hs270788 ESTs 1153
409799 D11928 Hs.76845 phosphoserine phosphatase-like 11.92
438461 AW075485 Hs236049 phosphoserine aminotransferase 1152
443912 R37257 Hs.184780 ESTs 1152
424606 AA343936 gb:EST49786 Gall bladder I Homo sapiens 1150
434217 AW014795 Hs23349 ESTs 1150
451533 NM_004657 Hs26530 serum deprivation response (phosphatidyl 1150
422423 AF283777 Hs.116481 CD72 antigen 1159
409398 AW386461 gb:PM4-PT0019-121299-004-F02 PT0019 Homo 1159
423853 AB011537 Hs.133466 slit (Drosophila) homolog 1 1152
446180 AI074413 Hs.14220 hypothetical protein FLJ20450 1150
414341 D80004 Hs.75909 KIAA0182 protein 1150
406538 11.79
433253 AW450502 Hs24218 ESTs 11.79
447397 BE247676 Hs.18442 E-1 enzyme 11.78
451684 AF216751 Hs26813 CDA14 11.76
416862 R23765 Hs23575 ESTs 11.74
425770 NM_014363 Hs.159492 spastic ataxia of Charlevoix-Saguenay (s 11.72
428326 AL048842 Hs.194019 attractin 11.72
433037 NMJM4158 Hs279938 HSPC067 protein 11.72
447476 BE293466 Hs20880 ESTs 11.72
452092 BE245374 Hs.27842 hypothetical protein FLJ11210 11.72
412922 M60721 Hs.74870 H2.0 (Drosophila)-like homeo box 1 1172
401680 NM_005578 Hs.180398 LIM domain-containing preferred transloc 11.69
422576 BE548555 Hs.118554 CGI-83 protein 1158
450203 AF097994 Hs501528 L-kynurenine/alpha-aminoadipate aminotra 1158
410531 AW752953 gb.<WO-Cro224-261099-O35-CjO2 CTQ224 Homo 1157
425917 W28517 Hs.117167 Homo sapiens cDNA: FLJ23067 fis, clone L 1156
418693 AI750878 Hs57409 thrombospondin 1 1154
400557 1152
416188 BE157260 Hs.79070 v-myc avian myelocytomatosis viral oncog 11.60
418047 AW952771 K&S0043 ESTs 1159
420441 AI986160 H&B844S ESTs 1159
400885 1157
409853 AW502327 gb:UWtr^R0fKika-a-07-OUI.r1 NIH_MGC_5 1156
400802 11.56
434540 NM_016045 Hs5184 TH1 (Drosophila) homolog 1155
431449 M55994 Hs256278 tumor necrosis factor receptor superfami 1155
425928 855736 Hs238852 ESTs, Weakly similar to hypothetical pro 1154
434701 AA460479 Hs.4036 KIAA0742 protein 1153
434228 Z42047 Hs283978 ESTs; KIAA0738 gene product 1152
420729 AW864897 Ha290825 ESTs 1152
428328 AA426080 Hs58489 ESTs 1150
433887 AW204232 Hs279522 ESTs 1150
414812 X72755 Hs.77367 monokine induced by gamma interferon 1146
457718 F18572 Hs22978 ESTs 1144
452260 AA453208 Hs28726 RAB9, member RAS oncogene family 1142
459029 AA131376 Hs.285203 fibroblast growth factor 12 11.42
456267 AI127958 Hs.83393 cystatin E/M 1139
433285 AW975944 Hs.237336 ESTs 1138
449186 AW291876 Hs.196988 ESTs 1137
447861 AI434593 Hs.164294 ESTs 1137
456023 R00028 gb:ye70a06.s1 Soares fetal liver spleen 1136
439444 AI277652 Hs54578 ESTs 1131
401163 1131
430886 L36149 Hs248116 chemokine (C motif) XC receptor 1 1128
450784 AW246803 HS47289 ESTs 1128
452391 AL044829 Hs.29331 carnitine palmitoyltransferase I, muscle 1127
449625 NM_014253 Hs.23796 odz (odd Oz/ten-m, Drosophila) homolog 1 1126
456827 AAD75687 Hs.147176 epidermal growth factor receptor substra 1124
439328 W07411 Hs.118212 ESTs, Moderately similar to ALU3_HUMAN A 1124
432093 H28383 gb:yi52c03.r1 Soares breast 3NbHBst Homo 1124
407335 AA631047 Hs.158761 Homo sapiens cDNA FLJ13054 fis, clone NT 1123
442501 AA3152B7 Hs23128 ESTs 1122
429746 AJ237672 Hs214142 5,10-methylenetetrahydrofolate reductase 1121
422858 R35398 gb:yg64g1 OjI Soares infant brain 1NIB H 1120
415156 X84808 Hs.78060 phosphorylase kinase, beta 1120
446713 AV660122 HS282675 ESTs 1120
452221 C21322 Hs.11577 ESTs 1120
418261 W78S02 Hs293297 ESTs 11.17
433332 AB67347 Hs.127809 ESTs 11.16
434539 AW748078 HS214410 ESTs 11.16
413471 BE142098 gb£M44fr0137-220999O17-d11 HT0137Homo 11.14
410037 AB020725 Hs58009 KIAA0918 protein 11.14
405601 11.13
458332 AKM0341 Hs220491 ESTs 11.12
427654 AA410183 Hs.137475 ESTs 11.12
427138 N77624 Hs.173717 phosphatidic acid phosphatase type 2B 11.10
431475 AI567669 Hs287316 ESTs 11.10
425710 AF030880 Hs.159275 solute carrier family, member 4 11X18
413748 AW104057 Hs.19193 ESTs 1137
409208 Y00093 Hs51077 integrin, alpha X (antigen CD11C (p150), 1U07
457278 W92745 Hs.183324 ESTs 1133
407021 U52077 gb:Human mariner1 transposase gene, comp 11.02
445701 AF055581 Hs.13131 lymphocyte adaptor protein 11.02
408338 AW887079 gb:Mm-SN0rj33-1204«H)02-c10 SN0033 Homo 1095
401030 BE382701 Hs25960 v-myc avian myelocytomatosis viral relat 1055
437891 AW006969 Hs.6311 hypothetical protein FLJ20859 1024
453874 AW591783 Hs56131 collagen, type XIV, alpha 1 (undulin) 1054
421562 AA530994 Hs.105803 ghrelin precursor 1052
413431 AW246428 Hs.75355 ubiquitin-conjugating enzyme E2N (homolo 1052
400132 1032
436420 AA44396S H&31595 ESTs 1050
424880 NM_000328 Hs.153614 retinitis pigmentosa GTPase regulator 1058
433264 D85782 Hs5229 cysteine dioxygenase, type I 1058
429842 AI366213 Hs.173422 KIAA1605 protein 1057
412405 AW948128 gbflC0-MTO013-28030O<)31-a12MTO013Horrio 1055
400615 1050
425018 BE245277 Hs.154196 E4F transcription factor 1 1050
414728 BE488883 Hs£80099 ESTs 9£6

418485 R91679 Hs.124981 ESTs 9.66 

433480 X02422 Hs.181125 immunoglobulin lambda locus 9.65

441530 AE48301 Hs.127112 ESTs 955 

5 433533 053304 Hs.65394 ESTs 9.65 

421470 R27496 Hs.1378 annexin A3 954

438813 C05569 Ha243122 hypoMcal protein FU13057 similar to 954 

429324 AA488101 Hs.199245 Inacfivation escape 1 952 

450244 AA007534 Hs.125062 ESTs 952 

10 407660 AW063190 H&279101 ESTs 951 

406554 9j60 

428404 AA377607 H&273138 ESTs 958 

447045 AW392394 H&278569 KIAAD064 gerie product 958 

449894 AK001578 Hs24129 hypothetical protein FU10716 958 

IS 448378 AI494332 Hs.196963 ESTs 958 

407902 AL117474 Hs.41181 Homo sapiens mRNA; eONA DKEZp727C191 (fir 956 

446572 AV659151 H&282961 ESTs 956 

459245 BE242623 H&31939 manic fringe (Drosophaa) homobg 955 

423545 AP000692 Hs.129781 chromosome 21 open reading frame 5 954 

20 414697 BE268134 Hs.78927 translocase of outer mitochondrial membr 954 

410848 AW807057 gb:MR4-ST0062-031t99418-b03 ST0062 Homo 952 

421181 NM.005574 Hs.184585 UM*3rnainc^2(rhc^tin-fike1) 952 

427308 026067 Hs.174905 K1AA0033 protein 952 

415995 NMJM4573 H&994 phosj>holtoasaC,b6ta2 951 

25 434848 AW295389 Hs.1 19768 ESTs 951 

414342 AA742181 Hs.75912 Homo sapiens cONA: FU22199 fis, done H 950 

416959 D28459 H&80612 utiqufflrKxnjugating enzyme E2A (RAD6 h 950 

443123 AA094538 Hs.6588 ESTs 950 

439312 AA833902 HS570745 ESTs 948 

30 449375 R07114 H&271224 ESTs 9.48 

436357 AJ132085 rjbiiomo sapiens mRNA for axonemal dyneln 9.44 

458723 AW137726 Hs244352 ESTs,MortoratelysintotolarnWnalph 944 

457526 AW450584 Hs.192131 ESTs, Weakly similar to RIBB [H^apiens] 9.43 

404741 9.43 

35 422409 NM.005428 Hs.116237 vavl oncogene 943 

403708 9.42 

408806 AW847814 Hs.289005 Homo sapiens cONA: RJ2 1532 lis, ctone C 9.42 

417380 T06809 gb£ST04698 Fetal brain, Stratagene (cat 9.42 

422501 AA354690 Hs.144967 ESTs 942 

40 426197 AA004410 Hs.157835 acytCoenzyme A oxidase 1 , palmtoyl 942 

452624 AU076606 K&30054 coagulation factor V (proaccelerin, bbl 9.42 

412110 AW893569 p>flCO44NO021-O4O40<M)21-c1ONN0021 Homo 9.41 

414158 AA361623 H&288775 Homo sapiens cONA FU13900 fts, dona TH 941 

408101 AW968504 Hs.123073 CDC2-related protein kinase 7 940 

45 414171 AA360328 H&865 RAP1A, member of RAS oncogene family 9.40 

415947 U04045 Hs.78934 mutS (E coli) homolog 2 (colon cancer, 9.40 

426959 BE262745 gb£01 153869F1 NtH_MQC_19 Homo sapiens c 9.39 

417519 AI689987 Hs.177669 ESTs, Weakly similar to RMS1_HUMAN REGUL 959 

457181 BE514362 H&296422 FK506-bInding protein 3 (25kD) 959 

50 402835 9.38 

404632 958 

446566 H95741 Hs.17914 Homo sapiens cONA; FU22801 fis, done K 957 
455369 AW903533 gb£M1-NN1031-060400-178-d05NN1031 Homo * 957 

444001 AKM5087 Hs.152299 ESTs, Moderately similar to ALU5_HUMAN A 956 

55 458191 AI420611 Hs.127832 ESTs 956 

431374 BE258532 Hs251871 CTP synthase 954 

429327 AA283981 Hs.199248 prostaglandin E receptor 4 (subtype EP4) 9.33 

407081 X97748 gbiH-saplens PTX3 gene promoter region. 953 

416987 BE616731 Hs.80645 interferon regulatory factor 1 953 

60 423013 AW875443 ' Hs22209 secreted modular calcium-binding protein 953 

439461 AAS93960 Hs.103158 ESTs 9.33 

418830 BE513731 Hs58959 Human ONA sequence from done 967N21 on 9.32 

422763 AA033699 Hs.83938 ESTs, Modarately.similar to MASP-2 [H.sa 952 

442739 NM.007274 Hs5679 cytosolic acyl coenzyme A thioestar hydr 952 

65 452859 AB00555 Hs588158 Homo sapiens cDNA: FU23591 fis, clone L 952 

403237 9.32 

415000 AW025529 Hs239812 ESTs, WeaMy similar to CALM_HUMAN CALMO 951 

417951 AW976410 Hs.289069 Horno sapiens cDNA:FU21016 fis, done C 950 

419066 Z98492 Hs.6975 PRO1073 protein 950 
448443 AW167128 Hs231934 ESTs 920

405125 920 

409768 AW499568 gbi)l+IFflRQKlJHiOJ<MJ|j1 N1H_M6C_5 928 

453708 A1191811 H&54S29 ESTs 928 

5 442271 AFD00652 Hs.8180 syndecan binding protein (syrrtenln) 927 

410055 AJ250839 H&58241 gene tor serto/hreonine protein kinase 926. 

448692 AW013907 Hs224276 ESTs, Moderately slmBar to predicted us 926 

417381 AF164142 Hs22042 solute canter femSy 23 (nucteobase Ira 925 

422497 029642 Hs.1528 KIAA0O53 gam product 925 

10 414140 AA281279 Hs23317 ESTs 924 

435980 AF274571 Hs.129142 ESTs; Weakly simSar to DEOXYRIBONUCLEAS 924 

458530 BE395035 Hs.199889 ESTs, Weakly sirriar to KIAA0874 protein 924 

402585 924 

420819 AA280700 gb2s95h1U1 NCLCGAP_GC81 Homo sapiens 923 

15 444755 AA431791 Hs.183001 ESTs 922 

411630 U42349 Hs.71119 Putative prostate cancer tumor suppresso 922 

421246 AW582962 Hs200961 ESTs, Highly sMartoAF151805 1 CGI4 920 

421924 BE514514 Hs.109606 coronin, actin-binding protein, 1A 9.19

414888 AIX39185 Hs.77558 thyroid hormone receptor interactor 7 9.18 

20 434267 AI206589 Hs.116243 ESTs 9.17 

409213 U61412 Hs21133 PTK6 protein tyrosine kinase 6 9.17 

428242 H55709 Hs2250 leukemia inMory factor (choBnerglo 9.16 

451736 AW080356 Hs293684 ESTs, Weakly similar to alternatively sp 9.15 

413627 BE182082 Hs246973 ESTs 9.14 

25 416134 AA528402 Hs.74861 activated RNA polymsrase II transcrlptio 9.14 

449251 AW151660 Hs21444 ESTs 9.14 

452813 U54727 Hs.191445 ESTs 9.14 

443622 AB11527 Hs.1 1805 ESTs 9.14 

413260 BE075281 gb:P M1-BT058529Q200<)05407BT0585Homo 9.12 

30 413450 Z99716 Hs.75372 N-acetylgalactosamlnidase. alpha- 9.12 

446442 BE221533 HS257858 ESTs 9.12 

438540 AA810021 Hs.1 36306 ESTs 9.12 

426251 M24283 Hs.168383 Intercellular adhesion molecule 1 (CD54) 9.11 

410290 AA402307 H&73818 ublqulnol-cytochrama c reductase hinge p 9.10 

35 437398 AA913736 Hs.126715 ESTs 9.10 

421559 NMJ014720 Hs.105751 Ste20-relaled serine/threonine kinase 9.10 

439699 AF086534 Hs.187561 ESTs, Moderately similar to ALU1_HUMAN A 9.10 

430799 C19035 Hs.164259 ESTs 9.09 

424544 M88700 Hs.150403 dopa decarboxylase (aromatic L-amho ad 9.08 

40 453942 AW190920 Hs.19928 ESTs 9.08 

425844 T68073 Hs.169628 serine (or cysteine) proteinase inhMo 9.08 

434658 AI624436 Hs.1 94488 ESTs 9.07 

453999 BE328153 Hs240087 ESTs 9.06 

436490 R71543 Hs.18713 ESTs 9.05 

45 409192 AA065131 HS233439 ESTs, Weakly slmBar to ALU7_HUMAN ALU S 9.05 

446223 BE300091 Hs.1 19699 hypoflieflcal protein RJ12969 924 

447247 AW369351 Hs287955 Homo sapiens cDNA RJ13090 lis, done NT 9.04 

450094 AI174947 Hs295789 Homo sapiens mRNA; cONA DKFZp5640t 164 (f 924 

432012 AW301344 Hs.195969 ESTs 924 

50 422520 AU076730 Hs.1 17977 Knesln 2 (60-70kD) 9.02 

418650 BE386750 HS2697B prolyl ertdopeptHase 922 

423008 M81590 Hs.123016 S^roxytryptarnine (serotonin) receptor 9.02 

436476 AA326108 Hs23631 ESTs - 9X2 

448206 BE622585 Hs.3731 ESTs 9X2 

55 431574 AW572659 Hs261373 adenosine A2b receptor pseudogene 9X1 

443453 R99876 Hs269882 ESTs 9X1 

435472 AW972330 Hs283022 triggering receptor expressed on myeloid 9X1 

420337 AW295840 Hs.14555 Homo sapiens cONA: RJ21513 fc, done C 9X0 

449810 AB008681 Hs23994 adMn A receptor, type IIB 9X0 

60 406780 AA9Q2386 Hs286 ribosomal protein 14 829 

429169 AW341130 Hs.197757 ESTs, Moderately similar to FGFE_HUMAN F 8.99 

421326 AF051428 Hs.1 03504 estrogen receptor 2 (ER beta) 827 

425491 AA883316 Hs255221 ESTs 826 

425516 BE000707 Hs29567 ESTs 8.96 

65 439773 AI051313 Hs.143315 ESTs 8.96 

443247 BE614387 Hs.47378 ESTs 826 

456623 AI084125 Hs.108106 transcription (actor 825 

438707 L08239 Hs.5326 porcupine 825 

402240 825 






444152 AI12S694 Ks.149305 Homo sapiens cDNAFU14264fis, done PL 8.95 

409842 AW501758 gbAJmF6R0p^iHX»(WJLf1 NIH_MGC_5 834 

416277 W78765 Hs.73580 ESTs 8.94 

456697 AI308006 H&111334 ferritin, light polypeptide 8.94

5 410762 AF226053 Hsj66170 HSKM-B protain 8.92 

412942 AL120344 Hs.75074 rrttoasrv-edivated protein Hrase-acBvat 8.82 

442320 AT287817 Hs.129636 ESTs 8.92 

449673 AA0O2064 Hs.18920 ESTs 8.91 

411486 N85785 Hs.181165 eukaryotic translation elongation factor 8.90 

10 437916 BE566249 H&20999 Homo sapiens cDNA: HJ23142 fis, done L 830 

442732 AA257161 Hs3658 hypoftetical protein DKFZp434E0321 839 

419741 NM_0Q7019 Ks.93002 ubiquitin carrier protein E2-C 8.89 

411499 AWB49292 gb:IL3<rr0215^)2030(H)90-E06 CT0215 Homo 839. 

431154 AW971228 Hs.290259 ESTs 839 

IS 414922 DO0723 Hs.77631 glycine cleavage system protein H (amino 838 

418036 Z37976 Hs33337 latent transforming growth factor beta b 837 

405422 837 

422926 NM_016102 Hs.121748 ring finger protein 16 837 

435220 050030 Hs.104 HGF activator 836 

20 418203 X54942 Hs.83758 COC28 protein kinase 2 836 

418613 AA744529 Hs.86575 mitogerwcBvated protein kinase kinase 835 

.439250 H66566 Hs.271711 ESTs 835 

432359 AA076049 H&274415 Homo sapiens cDNA RJ10229 Rs, done HE 834 

450000 AI952797 Hs.10888 .Homo sapiens cDNA: FU21559 fe, clone C 8.83 

25 425657 T89839 Hs.119471 ESTs 833 

425694 U51333 Hs.159237 hexokhase3(vMace!l) 832 

419972 AL041465 H&294038 ESTs, Moderately similar to ALU2_HUMAN A 832 

436396 AI683487 Hs.299112 Homo sapiens cDNA HJ1 1441 fis, dona HE 832 

413413 082520 Hs301834 Homo sapiens CDNAFU10952 fis, done PL 8.82 

30 428807 AA435997 Hs.104930 ESTs 832 

415839 R40611 Hs.137565 ESTs 831 

419553 N34145 Ha250614 ESTs 830 

420309 AW043637 Hs31766 ESTs 830 

421863 A1952677 Hs.108972 Homo sapiens mRNA; cDNA DKFZp434P228 (fr 8.80 

35 447965 AW292577 Hs34445 ESTs 830 

459172 BE063380 gb:PM0-BT0275-291099<)02-g10BT0275Horno 8.80 

403259 8.78 

411534 AW850473 gb:ll3-CT0219-28010(H)61-B11 CT0219Homo 8.78 

456161 BE264645 Hs382093 Homo sapiens cDNA: FU21918 fis, clone H 8.77 

413654 AA331881 Hs.75454 peroxiredoxin 3 8.76

401744 8.76 

425348 AL137477 Hs.155912 cadherin-fike 24 8.76 

423396 AI382555 Hs.127950 bronwdomairwontaWng 1 8.75 

450649 NM.001429 H&297722 Human DNA sequence from done RP1-85F18 8.75 

45 408331 NMJ007240 Hs.44229 dual specificity phosphatase 12 8.74 

423872 AB020316 Hs.134015 uronyl2-sulfotransterase 8.74 

424903 A1566086 Hs.153716 Homo sapiens mRNA for Hmob33 protein, 3 8.74 

427596 AA449506 Hs.179765 Homo sapiens mRNA; cDNA DKFZp586H1921 (f 8.73 

432488 AA5S1010 Hs316640 ESTs 8.72 

50 448980 AL137527 Hs32703 Homo sapiens mRNA; cDNA DKFZp434P1018 (f 8.72 

429455 AI472111 Hs392507 ESTs 8.71 

429855 AW385597 Hs.138902 ESTs, Weakly sWar to B34087 hypotheti 8.71 

441746 H59955 Hs.127829 ESTs * 8.70 

411945 AL033527 Hs.92137 v-myc avian myelocytomatosls viral onoog 8.70 

55 413492 D87470 Hs.75400 KIAAQ280 protein 8.70 

435706 W31254 Hs.7045 GL004 protein 8.70 

433741 AA609019 Hs.159343 ESTs 8.70 

426340 Z97989 Hs.169370 FYN oncogene related to SRC, FGR, YES 839 

422779 AA317036 Hs.41989 ESTs 837 

60 4497B5 AI225235 Hs.288300 Homo sapiens cDNA: FLJ23231 fis, done C 837 

420144 AA811813 Hs.1 19421 ESTs - 8.66 

420235 AA256756 Hs.31178 ESTs 8.66 

432606 NM.002104 Hs.3066 granzymaK (serine protease, granzyme 3; 836 

425762 BE244076 Hs.159578 Homo sapiens mRNA for FU00020 protein, 8.65 

65 427448 BE246449 Hs.2157 Wiskott-Aldrich syndrome (eczema-fhrambo 834 

418033 W68180 H&259855 Homo sapiens cDNA HJ12507 fis, done KT 8.64 

429084 AJ001443 Hs.195614 spndngfactor3b,subunlt3,130kD 8.64 

417094 NM.006895 Hs.81182 histamine NHnetfryftransferase 8.64 

457277 NM_004736 Hs227656 xenotroplc and potytroplc retrovirus rec 8.63 
422631 BE218919 Hs.1 18793 hypothetical protein FU10688 853

410679 AW79S1S6 Hs£15857 ring finger protein 14 8.63 

431585 BE242B03 Hs.262823 hypoBieBcal protein RJ10326 8.62 

401851 852 

5 401866 852 

407783 AW996872 Hs.172028 adfetntegiinardineiaikJpiDtBlnasadoma 852 

408242 AA251594 Hs/3913 PI8F1 gene product 8.62 

422250 AW408530 Hs.1 13823 apX(caselnoryfic protease X,Ecoii) 852 

430259 BE550182 Hs.127826 RalGEFtke protBin 3, mouse homolog 8.62 

10 452598 AI831594 Hs.68847 ESTs, Weakly slmiar to ALU7_HUMAN ALU S 8.62 

419541 AW749617 gbfiC3^T0502-1301(XK)12-g07 BTQ502 Homo 160 

428839 AI767756 Hs.82302 ESTs 8.60 

429328 AA829402 Hs/7939 ESTs 8.60 

451491 AI972094 H&286221 Homo sapiens cONAFUII 3741 Gs, clone PL 8.60 

15 452561 AI692181 Hs/9169 WAA1634 protein 8.60 

420027 AF009746 Hs54395 ATP-btriding cassette, sub-family D (ALU) 8.60 

435205 X54138 Hs.181125 immunoglobulin lambda locus 8.60

430900 U91939 rte248123 Q protein-coupled receptor 25 8.60 

405074 859 

20 437991 AI479773 Hs.181679 ESTs 859 

436346 BE328882 Hs.193096 ESTs, Modsrately similar to U1 19_HUMAN U 858 

411079 AA091228 gbxchn2152^eq.F Human fetal heart, Lam 857 

41B452 BE379749 Hs55201 Otype (calcium dependent, carbohydrate- 856 

429109 AL008637 Hs.196352 . neutropha cytosoDc factor 4 (40©) 856 

25 448019 AW947164 Hs.195641 ESTs 856 

449865 AW204272 Hs.199371 ESTs 855 

431180 H55883 gb:yq94M&r1 Scares (etal liver spleen 854 

445988 BE007663 Hs.13503 tnacSvaSon escape 2 854 
405876 

30 407235 D20569 Hs.169407 SAC2 (suppressor of actin mutations 2, y 854 

414807 AI738616 Hs.77348 rr/droxyprostaglandlri dehydrogenase 15-(N 854 

425871 AF193612 Hs.159142 lunatic fringe (Drosophlla) hrmiolog 854 

452413 AW082633 Hs£12715 ESTs 854 

421620 AA446183 Hs.91885 ESTs 853 

35 444539 AI955765 Hs.146907 ESTs 852 

415102 M31899 Hs.77929 excision repair cross-complementing rods 851 

405552 851 

418068 AW971155 Hs293902 ESTs, Weakly similar to prolyl 4-hydroxy 850 

420133 AA426117 Hs.1 4373 ESTs 850 

40 438887 R68857 Hs£65499 ESTs 850 

446468 AI765890 Hs.16341 ESTs; Moderately similar to HU ALU SUB 850 

446585 AV659397 Hs.282948 ESTs 850 

441896 AW891873 gbCM3^T009OO40500-173-b02 NT0090 Homo 850 

437718 AI927288 Hs.196779 ESTs 8.48 

45 420656 AA27909B Hs.187636 ESTs 8.48 

429303 AW137635 Hs/4238 ESTs 8/8 

450624 AL043983 Hs.125063 Homo sapiens cDNA FU13825 OS, done TH 8.48 

452573 AI907957 Hs287622 Homo sapiens cDNA FU14082 fis, done HE 8.48 

456341 AA229126 Hs.122647 N-myristoyllrarisferase2 8.48 

50 423024 AA593731 Hs.75613 CD36 antigen (collagen type I receptor, 8/7 

446985 AL038704 Hs.156827 ESTs, Weakly similar to ALU1J1UMAN ALU S 8/6 

431778 AL080276 Hs268562 regulator of G-protBfn dgnarrmg 17 8.46 
400268 " 8/6 

421828 AW891965 H&289109 dimethylargininedimeftylaminohydrolase 8.45 

55 417022 NM 014737 Hs50905 Ras association (Ra!G0S/AF-6) domain fam 8.44 

421029 AW057782 Hs293053 ESTs 8.44 

425171 AW732240 Hs500615 ESTs 8/4 

459070 AI814302 gb^ri71c12J(1 NCI_CGAP_Lu19 Homo sapiens 8.42 

406006 * 8/2 

60 412643 AW971239 Hi293982 ESTs 8/2 

424775 AB014540 Hs.153026 SWAP-70 protein 8/2 

446848 AW136083 Hs.195266 ESTs, Weakly similar to S59501 interfero 8.42 

448043 AI458653 Ha201881 ESTs 8.41 

407183 AA358015 gfcEST66864 Fetal lung III Homo sapiens 8.40 

65 412324 AW978439 Hs.69504 ESTs 8/0 

419594 AA013051 Hs.91417 topoisomerase (DNA) II binding protein 8.40

430968 AW972830 gb£ST384925 MAGE resequences, MAGL Homo 8.40 

431689 AA305688 Hs267695 UDP-GatbetaGlcNAcbeta 1,3-galac(osyar 140 

438582 AI521310 Hs383365 ESTs, Weakly similar to ALU5_HUMAN ALUS 8.40 
447685 AL122043 Hs.19221 hypothetical protein DKFZp566G1424 8.40

459119 AW844498 Hs589052 Homo sapiens LB^mRNA, variant C, part 138 

400817 137 

425265 BE245297 gb.TC8AP1E2482 Pediatric pre-8 cell ecut 137 

S 409385 AAQ71267 gb3m61g01/1 Stratagena toroblast (937 138 

439121 BB047779 H&44701 ESTs 138 

4199S8 X04430 Hs53913 interleukin 6 (interferon, beta 2) 138

408327 AW182309 H&249963 ESTs, Highly similar to dJI 170K4.4 [Usa 855 

403976 854 

10 448064 AA37903S gb£ST91809 Synovial sarcoma Homo sapfen 853 

442914 AW188551 H&99519 Homo sapiens cDNA FU14007 fis, clone Y7 853 

428032 AW997704 H&11493 Homo sapiens cONARJI 3536 fis, done PL 852 

434194 AF119847 Hs583940 Homo sapiens PRO1550 mRNA, partial cds 852 

458677 AW93767D Hs554379 ESTs 852 

15 420925 NM_0156S8 Hs.100391 T54 protein 850 

416475 T70298 gbyd26gOis1 Soares fetal Bver spleen 850 

416852 AP283776 Hs50285 Homo sapiens mRNA; cDNA OKFZp586C1723 (1 850 

430676 AF084866 gbJtomo sapiens envelope protein R1C-3 ( 850 

428455 AI732694 Hs58520 ESTs 129 

20 435343 AW194362 Ks.199028 ESTs 129 

450783 BE266695 gb£01 190242F1 NIH_MGC_7 Homo sapiens cD 129 

404946 828 

422942 AF054839 Hs.122540 tetraspan2 128 

453716 AA037675 Hs.152675 ESTs 128 

25 437098 AA744488 Hs.132842 ESTs, Moderately sMar to ALU1_HUMAN A 858 

443907 AU076484 K&9963 TYRO protein tyrosine kinase binding pro 857 

401930 AF106069 Hs53168 ubiquKin specific protease 15 856 

446554 AA151730 Hs501789 ESTs, Weakly similar to similar to Cele 856 

426290 AB007918 Hs.169182 K1AA0449 protein 125 

30 419904 AA974411 Hs.18672 ESTs 855 

413886 AW958264 Hs.103832 ESTs, Weakly similar to TRHYJ1UMAN TR1CH 854 

424738 AI963740 H&46826 ESTs 854 

427359 AW020782 Hs.79881 Homo sapiens cDNA: FU23006 fis, clone L 854 

424534 D87682 Hs.150275 K1AA0241 protein 854 

35 424429 U63830 Hs.146847 TRAF family member-associated NFKB activ 854 

442604 BE263710 Hs579904 ESTs 852 

442992 AI914699 Hs.13297 ESTs 852 

427210 BE396283 Hs.173987 eukaiyotic translation inltiatJrai factor 852 

457229 BE222450 Hs566390 ESTs 851 

40 423730 AA330214 gbfST33935 Embryo, 12 week II Homo sapl 851 

411928 AA888624 Hs.19121 adaptor-related protein complex 2, alpha 850 

416051 AA835868 Hs55253 Homo sapiens cDNA: FU20935 fis, done A 850 

417231 R40739 Hs51326 ESTs 850 

422049 W25760 Hs.77631 glycine deavage system protein H (amino 850 

45 427528 AU077143 Hs.179565 rrurWirornosome maintenance deficient (S. 850 

458776 AV654978 Hs.19904 cystathionasa (cystathionine gamma-lyase 8.19 

417687 AI828596 Hs550691 ESTs 8.18 

423218 MMJ015896 Hs.167380 BLu protein ' 8.18 

425397 J04O88 Hs.156346 topoisomerase (DNA) II alpha (170kD) 8.18

50 406964 M21305 Hs547946 Human alpha satellite and satellite 3 |u 8.18 

402401 U42349 Ks.71119 Putative prostate cancer tumor suppresso 8.18 

423397 NM_001838 Hs.1652 chemokine (C-C motif) receptor 7 8.18 

427857 AL133017 Hs5210 thyroid hormone receptor Interactor 3 - 8.17 

401519 8.17 

55 447188 H65423 Hs.17631 Homo sapiens cOMA FU201 18 fis, done CO 8.16 

424704 AI263293 Hs.152096 cytochrome P450, subfamily l!J (arachido 8.16 

435S54 AJ278120 Hs.4996 DKFZP564D166 protein 8.14 

448556 AW885606 Hs.5064 ESTs 8.14 

449217 AA27B533 Hs53262 rfbonudease, RNase A family, k6 8.14 

60 453124 AI139058 Hs53296 ESTs 8.14 

442812 AI018406 Hs.131284 ESTs 8.14 

421129 BE439899 Hs.89271 ESTs 8.14 
TABLE 9A shows the accession numbers for those primekeys lacking a unigeneID in Table
9. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

Pkey CAT number Accession






410896 1226053.1 



408057- 1035720_-1 AW139565 

408069 103655 1 H81795 Z42291 R20973 AA046920 

408182 104479 1 AA047854 AA057S06 M053841 

408338 1052148 1 AW867079 AW8S7086 AW182772 

408828 108463 1 BE540279 AW410659 AA057857 R77693 BE278B74 

409126 110159J AA063426 AW962323 AW408063AA063503 AA772927 AW753492BE175371 AA311147 

409292 111586 1 AA071051 AA070584 AA069938 M102138 AA074430 

409314 111841 1 AA070266 AA084957 AA12S998 

409385 112523 1 AA07I267 T65940 T64515 AA071334 _ 

409398 1126716 1 AW386461 AW876408 AW386672 AW386599 AW876258 AW3865 1 9 AW386289 AW876136 AW876203 AW876213 AW876301 

AW876295 AW876349 AW876365 AW876160 AW876369 AW876352 AW876271 

409671 114731 1 AA076769 AA076781 AI087968 

409768 1154035.1 AW499566 AW502378 AW499522 AW502046 AW502671 AW501917 AW501868 AW501721 AW502813 

409841 1156088.1 AW502139 AW5Q2432 AW502235 AW501633 AW502647 

409842 1156119.1 AW501756 AW502096 AW502465 AW50171S 
409853 1156226 1 AW502327 AW502488 AW501829 AW502625 AW502687 
410531 1207200 1 AW752953 H88044 BE156092 

410888 1216101.1 AW796342 AW796356 BE161430 

410846 1223902.1 AW807057AW807054AW807189AW807193AVIB^ 
AW807331 

AWB09637 AWB09697 AW810554 AWB09707 AW809885 AW810000 AW810088 AW809742 AW809816 AW809749 AW809S39 
AW809722 AW809836 AW809774 AW81 0023 AW810013 AW809813 AWB09660 AW809728 AW809768 AYI/809951 AWB09657 
AW809954 

411079 123128 1 AA091228 H71860 H71073 

411424 1245497 1 AW845985 AW845991 AW345952 

411499 1248105 1 AWB49292 AW849431 AW849422 AW849428 AW849420 AW849424 AW849427 

411507 1248607~1 AW850140AW850195AW850192 

411534 1248827 1 AW850473AW850471 AW850431 AW850523 

411972 1268491 1 BE074959AWB80160 

4121 10 1277844 1 AW893569 AW893571 AW893588 AWB93593 

412226 1284289 1 W26786AW998612 AW902272 

412257 1285376.1 AW9O3330BE071916 

412405 1293012.1 AW948126 AW948139AVV948196AW948145AW948162AV^ 

AW948131 AW948158 AW948164 AW948151 

413260 1356003.1 BE075281BE075219BE075123BE075119BE075046 

413471 1371778 1 BE142098BE142092 

413729 13B5114J BE1S9999BE160056BE160107BE160139 

414182 142409 1 AA136301 A1381776 AA138321 

414989 1511339 1 T8166BC19040C17569 

415354 1534763.1 F06495 R24338 R13046 

416011 1566439 1 H14487 R50911 Z43216 

416475 1596398 1 T70298 H58072 R02750 

417380 1672461.1 T06809N75735 

419392 1843934.-1 W28573 

419541 185724.1 AW749617R64714AA244138AA244137BE094019 

419544 185760.2 AJ909154 AA526337AA244193AI909153 

420819 196721 1 AA280700 AW975494 AA687385 

421245 200620 1 AA285383 AA285333 AA285359 AA285326 AA285350 

422673 219674J N59027AA314694 N53937 R08100 
422695 219998J AA3151S8 AW961298 N76067 AW802759 AI85B495 W04474

422858 222209J R35398BE252178AA318153 

422940 223108 1 BE077458 AA337277 AA319285 

423730 231462J AA3302U AW962519 754709 

423790 232031 1 BE152393 AA330984 BE073904 

424385 238731 1 AA339668AWS52809AA349119 

424606 241409 1 AA343936 AA344060AW963081 

425265 249175 1 BE245297 AA353976 AWS05023 

426959 273B3D_-1 BE262745 

430676 32168J AF084866 AF084870 AF084864 AF084867 AF084869 AF084865 AF084868 AW818206 AW812038 BE144813 BE144812 
AW812041 AW812040 AW812067 BE061583 BE061604 TO5808 AB52469 AA580921 BE141783 BE141782 BE061601 
AW814393AWB85029 

430968 326269 1 AW972830 AA527647 AA489820 AA570362 

431180 328906 1 H55883 AW971249 AA493900 H55788 

432093 341283 1 H28383 AW972670 H28359 AAS25808 

434598 38937 1 T59538 T59589 T59598 T59542 AF147374 

436357 41842J AJ132085 Z83805 

437159 43393J AUK0072 AW800148 

437495 43765J BE177778 BE177779 AL390180 AA359908 

439097 46858 1 H66948AF085954 H66949 

439120 46879J H58389AF085977 H56173 

440134 48675 1 BE410734 BE560117 BE270054 BE296330 BE267957 AI003007 BE545259 

441896 52842_1 AW891873 AW891897 BE554764 

445629 645787 1 AK45701 BE272724 

447229 71288 1 BE817135AW504051 AW504283 

448064 74761 1 AA379036 AA150589 AI696854 BE621316 

450783 84655J BE266695 BE265474 N53200 BE267333 

451045 85673.1 AA215672 AI696628AAO13335H86334AA017006 

452549 921602.1 AI907O39 AI907081 

452560 922216 1 BE077084 AW139963 AW8S3127 AW306209 AW803204 AW806205 AW806206 AW80621 1 AW806212 AW806207 AW806208 
AW806210AI907497 

452712 928309 1 AW^16AW838660BE144343A1914520AW888910BE184854BE1847B4 

453758 980026.1 U83527AL120938U83522 

454093 1007366.1 AWB60158 AWS62385 AW860159 AW862386 AW862341 AW821869 AW821893 AVU062660 AW062656 

454563 1224342.1 AW807530 AW807540 AWB07537 AW846086 BE141634 AW846089 AW807499 AW807533 AW838499 

454791 1234759.1 BE071874 BE071882 AW820782 AW821007 

454977 1247099J AW848032 AW848630 AW848478 AW848623 AW848484 AW848169 AW848830 AW848149 AW8481 19 AW848893 AW848903 
AWB48407 

455131 1254674.1 AVV857913AW857916AW857914AWB61627AVV861626 AW861624 

455183 1259023.1 AW884111 AW863918 AW863856 

455254 1266449 1 AW877015 AW877133 AW876978 AW877071 AW876988 AW877069 AW877053 AW877013 

455369 1285173J6 AW903533AWmi6AW903562BE085202BE085215BE085214BE085209BE085172 

~ B £055199 

455982 1396849 1 BE176862BE176876BE176947BE176878 

456011 1410860~1 BE243628 BE246081 BE247016BE241984BE241534BE246091 BE245679BE243620BE245998BE242329BE241417 

BE241457 BE242522 BE241989 BE241464 

456023 1416335.1 R00028BE247630 

457586 360505 1 AW062439 AW751554AA579463 

457595 364225.-1 AA584854 

457751 399422.1 AI908236 AA663731 

459070 883688.1 AI814302 A1814428 

459081 889426 1 W07808AI822066 

459145 918957 1 AI903354 AI903489 AI903488 

459172 921149.1 BE063380BE063346 AB06097 

459234 945240_-1 AI940425 
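The row layout of Table 9A (Pkey, CAT number, then one or more Genbank accessions) can be read programmatically. The following is only a minimal sketch: the function name `parse_cluster_row` and the dictionary keys are hypothetical, and it assumes fields are separated by whitespace, which does not hold for rows where the accessions were run together in this copy.

```python
from typing import Dict, List, Union


def parse_cluster_row(row: str) -> Dict[str, Union[int, str, List[str]]]:
    """Split one Table 9A row into Pkey, CAT number and accession list.

    Assumes whitespace-separated fields (an assumption of this sketch).
    """
    fields = row.split()
    if len(fields) < 2:
        raise ValueError(f"expected at least Pkey and CAT number: {row!r}")
    return {
        "pkey": int(fields[0]),        # Eos probeset identifier
        "cat_number": fields[1],       # gene cluster number, kept as text
        "accessions": fields[2:],      # Genbank accessions in the cluster
    }


# Row taken from Table 9A as printed.
row = parse_cluster_row("410888 1216101.1 AW796342 AW796356 BE161430")
print(row["pkey"], row["cat_number"], row["accessions"])
```

The CAT number is kept as a string because the table prints it with a suffix after the cluster number, and nothing in the text says how that suffix should be interpreted.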
TABLE 9B shows the genomic positioning for those primekeys lacking unigene IDs and
accession numbers in Table 9. For each predicted exon, we have listed the genomic sequence
source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.



Pkey Ref Strand Nt_position
400452 8113550 Minus 90308-90505
400557 9801261 Plus 208453-208528,209633-209813
400615 9908994 Plus 118036-118166,11868M18807
400802 8567867 Minus 174571-174858
400817 85S9994 Plus 170793-170948
400880 9931121 Plus 29235-29336,36363-36580
400885 9958187 Minus 58242-58733
400926 7851921 Minus 52033^158^3956^120^4957-55052^5420-5548056452-56666^7221-57718
400952 7658481 Plus 192667-192826,194387-194876
400991 8096825 Plus 159197-159320
401044 8117619 Plus 73501-73674
401124 8570296 Minus 124181-124391
401163 6981820 Plus 5302-5545
401201 9743387 Minus 138534-138629,139234-139294,140121-140335,142033-142479
401286 9801342 Minus 147036-147318
401384 6850939 Minus 58360-58545
401468 6433826 Plus 13056-13482
401515 7630851 Plus 29929-30126
401519 6649315 Plus 157315-157950
401672 9838136 Plus 128526-128704,130755-130860
401744 2576349 Plus 14595-14751
401851 7770425 Minus 146443-146664,147794-147971,148351-148480,148980-149111,149801-149949
401866 8018106 Plus 73126-73623
402240 7690131 Plus 104382-104527,106136-106372
402359 9211204 Minus 40403-41961
402585 9908890 Minus 174893-175050,183210-183435
402788 9796102 Plus 98273-101430
402802 3287156 Minus 53242-53432
402812 6010110 Plus 25026-25091,25844-25920
402828 8918414 Plus 69071-69642
402835 9187337 Plus 26961-27101
402838 9369121 Minus 32589-32735,35478-35666
402842 9369121 Minus 76355-76479
402895 9967547 Plus 85537-85671,86379-86469
402964 9581599 Minus 46624-46784
403137 9211494 Minus 92349-92572,92958-93084,93579-9371253949-94072^4591-94748,95214-95337
403237 7637807 Plus 7271-7527
403259 7770585 Plus 4693-4857
403683 7331517 Plus 217175-217446
403690 7387384 Minus 78627-79583
403708 5705981 Minus 134394-134812
403838 4176355 Plus 19197-19502
403851 7708872 Plus 22733-23007
403976 7657840 Plus 24755-24969
404407 7329316 Minus 48154-48499
404426 7407959 Plus 77842-77954
404632 9796668 Plus 45098-45229
404741 8574139 Plus 143025-143467
404756 7706327 Plus 8284943627
404946 7382189 Plus 134445-134750
405074 7770440 Plus 44340-44559,44790-45059
405125 8247873 Plus 137113-137814
405172 9966752 Plus 153027-153262
*tw9COO f CttvVf U


Minus 


151699-151915 




Minus 


25818-26380 




WUjttKf 


17503-17778 IRflPI-lftPQfl 




Mimic 




^ncceo 1RS9S06 


Pius 


4519^45647 


*nJQOUI WHvww 


Minus 


14783S-1 47935 149220-149299 


405685 450B129 Minus 37956-38097
405777 7263187 Minus 104773-105051
405856 7653009 Plus 101777-102043
405876 6758747 Plus 39694-40031
405932 7767812 Minus 123525-123713
405934 6758785 Plus 159913-160605
406006 8247801 Minus 42640-42776
408134 9163473 Plus 153291-153452
406189 7289992 Minus 22007-22234
408422 9256411 Plus 163003-163311
406516 7711422 Minus 128375-128449,128560-128784
406538 7711478 Plus 3519645367.38229-38476,40080-40216,43522-43840
406554 7711566 Plus 106956-107121
406577 7711730 Plus 11377-11509
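The Nt_position entries of Table 9B are comma-separated start-end pairs, one pair per predicted exon. As a minimal sketch of how such an entry could be parsed, the helper names `parse_nt_position` and `total_exon_length` below are hypothetical, and the length calculation assumes the listed coordinates are inclusive at both ends, which the table legend does not state.

```python
from typing import List, Tuple


def parse_nt_position(field: str) -> List[Tuple[int, int]]:
    """Turn 'start-end,start-end' into a list of (start, end) pairs."""
    exons = []
    for part in field.split(","):
        start_text, end_text = part.split("-")
        start, end = int(start_text), int(end_text)
        if end < start:
            raise ValueError(f"end before start in {part!r}")
        exons.append((start, end))
    return exons


def total_exon_length(exons: List[Tuple[int, int]]) -> int:
    """Sum exon lengths, assuming inclusive coordinates (an assumption)."""
    return sum(end - start + 1 for start, end in exons)


# Nt_position entry for Pkey 400557 as printed in Table 9B.
exons = parse_nt_position("208453-208528,209633-209813")
print(exons)
print(total_exon_length(exons))
```

Entries that were damaged in this copy (for example those containing stray characters in place of the separators) raise an error rather than being silently guessed at.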
TABLE 10 shows genes, including expression sequence tags, differentially expressed in
taxol resistant prostate tumor xenografts as compared to taxol sensitive prostate tumor
xenografts. The genes are indicated as either being upregulated or downregulated during the
induction of taxol resistance in sequential passages of the grafts.

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
Eos: Internal Eos name
F00-F14: passage number



Pkey ExAccn UnigeneID Unigene Title Eos Resp F00 F00 F02 F02 F05 F05 F07 F09 F10 F11 F13 F14
117921 N51002 Hs.47170 UpiinA2 PM2B UP 1 9 8 9 32 20 34 122 105 82 71 111
112971 T17185 H&4299 ESTs CHA1 down 290 281 267 335 270 284 150 157 83 89 1 49 1 75
126645 AI167942 Hs.61635 STEAP PAAS down 106 111 103 71 34 67 33 14 2 1
119018 N95796 Hs.179809 ESTs PA82 down 765 841 757 909 742 704 478 428 253 175 228 238 84
110844 N31952 Hs.167531 ESTs PAV7 down 175 192 147 141 123 129 73 65 55 48 54
100654 HG2841-HT2969 Hs.75442 Albumin, A PM01 down 666 605 504 728 357 445 602 187 117 127 117 113
100655 HG2841-HT2970 Hs.75442 Albumin, A PM02 down 620 653 486 688 368 386 606 175 101 95 115 97
102076 U09579 H&252437 o/cMep PM03 down 101 94 143 190 105 107 88 40 34 31 46 22
102208 U22961 Hs.75442 albumin PM04 down 495 424 323 518 252 296 497 188 169 143 165 145
103739 AA075779 mitochondr PM05 down 75 190 606 230 378 106 218 88 69 192 69 99
107036 AA599690 Hs.15725 SBBI48 PM06 down 87 124 115 188 132 111 66 71 49 70 38 50
108242 AA062746 ESTs PM07 down 14 20 252 13 22 43 193 10 10 104 21 13
108282 AA065143 solute car PM08 down 27 54 178 73 108 37 63 24 14 53 15 34
108679 AA1159S3 beta-1-gb PM09 down 680 893 1292 656 869 389 1 74 118 662 359 1 409
108731 AA126313 Hs.107476 ATP syntha PM10 down 10 19 185 25 60 1 32 3 7 14 1
110675 H89355 Hs.6598 adrenergic PM11 down 207 334 237 239 231 220 119 145 93 64 56 124
115412 AA283804 Hs.193552 ESTs PM12 down 146 316 282 271 340 334 115 238 100 196 83 207
115844 AA430124 H&234607 MDM2 PM13 down 49 93 94 154 132 91 23 54 23 76 14 41
120588 AA281591 Hs.16193 ESTs PM14 down 80 157 58 141 159 127 39 83 35 37 16 46
132349 Y00705 Hs.181288 serine pro PM15 down 146 217 214 180 106 128 177 85 54 63 66 41 56
132888 AA490775 Hs5920 N-ecetytma PM16 down 92 150 132 178 126 139 53 94 48 67 60
132967 AA032221 Hs.61635 STEAP PM17 down 224 208 203 215 205 180 132 65 68 50 48 29 63
133063 AA283085 H&64065 ESTs PM18 down 85 148 161 150 92 108 42 99 42 65 126
134374 062633 Hs.8238 ESTs PM19 down 230 240 194 212 231 189 89 123 107 95 68 91
135400 M23263 Hs.99915 androgen r PM20 down 36 187 99 178 132 101 23 71 26 122 14 44
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TABLE 11 : shows genes, including expression sequence tags that are up-regulated in 
prostate tumor tissue compared to normal prostate tissue as analyzed using Affymetrix/Eos 
Hu01 GeneChip array. Shown are the ratios of "average" normal prostate to "average" 
prostate cancer tissues. 



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Background subtracted normal prostate : prostate tumor tissue
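The R1 column can be read as a simple ratio of averaged signals. The following is a minimal sketch in Python, not the procedure used to generate Table 11: the function name `r1_ratio`, the scalar `background` value, the `floor` clamp, and the use of an arithmetic mean are all assumptions made for illustration, since the text states only that the ratio is of "average" normal prostate to "average" prostate cancer tissue after background subtraction.

```python
from statistics import mean


def r1_ratio(normal, tumor, background=0.0, floor=1.0):
    """Ratio of mean background-subtracted normal signal to mean tumor signal.

    `background` and `floor` are hypothetical parameters; the source does not
    say how background was estimated or how low signals were handled.
    """
    # Subtract the assumed scalar background and clamp at `floor` so the
    # ratio stays defined when a signal is at or below background.
    norm = [max(x - background, floor) for x in normal]
    tum = [max(x - background, floor) for x in tumor]
    return mean(norm) / mean(tum)


# Made-up intensities for one probeset: low in normal, high in tumor.
ratio = r1_ratio(normal=[12, 8, 10], tumor=[510, 490, 500], background=5)
print(round(ratio, 3))
```

Under these assumptions, a value well below 1 corresponds to a probeset with a higher signal in tumor than in normal tissue, which is the direction reported in Table 11.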




Pkey ExAccn UnigeneID Unigene Title R1


101338 L49169 Hs.75678 FBJ murine osteosarcoma viral oncogene homolog B 0.012
130642 M63438 Hs.156110 Immunoglobulin kappa variable 1D-8 0.015
133512 X01677 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 0.017
133436 H44631 Hs.737 immediate early protein 0.017
129292 X13810 Hs.1101 POU domain; class 2; transcription factor 2 0.019
100610 HG2566-HT4792 Microtubule-Associated Protein Tau, Alt. Splice 3, Exon 8 0.02
133448 M34516 Hs.170118 immunoglobulin lambda-like polypeptide 3 0.021
125193 W67577 Hs.84298 CD74 antigen (invariant polypeptide of major histocompatibility complex; class II antigen-associated) 0.022
133456 T49257 Hs.183704 ubiquitin C 0.022
134546 AA459310 Hs.8518 Homo sapiens mRNA; cDNA DKFZp586L1722 (from clone DKFZp586L1722) 0.023
102131 U15085 Hs.1162 major histocompatibility complex; class II; DM beta 0.023
101375 M13560 Hs.84298 CD74 antigen (invariant polypeptide of major histocompatibility complex; class II antigen-associated) 0.023
100674 HG3033-HT3194 Spliceosomal Protein Sap 62 0.024
134365 R32377 Hs.82240 syntaxin 5A 0.027
132335 D60387 Hs.189885 ESTs 0.027
110303 H37901 Hs.32708 ESTs 0.028
131678 N59162 HsXS0542 ESTs 0.028
116599 D80046 Hs.250879 ESTs 0.029
133769 M17733 Hs.75968 thymosin; beta 4; X chromosome 0.029
107804 AA026648 Hs.61389 ESTs 0.03
129427 T80746 Hs.111334 ferritin; light polypeptide 0.03
105987 AA406631 Hs.110299 mitogen-activated protein kinase kinase 7 0.03
131466 F03233 Hs.27189 ESTs 0.032
102859 X00274 Hs.76807 Human HLA-DR alpha-chain mRNA 0.032
134626 S82198 Hs.8709 caldecrin (serum calcium decreasing factor; elastase IV) 0.032
134170 M63138 Hs.79572 cathepsin D (lysosomal aspartyl protease) 0.033
131713 X57809 Hs.181125 Immunoglobulin lambda gene cluster 0.034
100748 HG3517-HT3711 Alpha-1-Antitrypsin, 5' End 0.034
118769 N74496 ESTs 0.034
111734 R25375 Hs.126916 ESTs 0.038
109221 AA192755 Hs£5840 ESTs; Weakly similar to stac [H.sapiens] 0.036
133846 AA480073 Hs.76719 U6 snRNA-associated Sm-like protein 0.036
135281 AA401575 Hs.97757 ESTs 0.037
119073 R32S94 Hs.45514 v-ets avian erythroblastosis virus E26 oncogene related 0.037
100760 HG3576-HT3779 Major Histocompatibility Complex, Class II Beta W52 0.037
101426 M19483 Hs.25 ATP synthase; H+ transprtng; mitochndri F1 complex; beta polypept 0.038
129568 AA428025 Hs.114360 transforming growth factor beta-stimulated protein TSC-22 0.038
130900 238468 Hs.21036 ESTs; Moderately similar to F25965_3 [H.sapiens] 0.039
133879 M13829 Hs.77183 v-raf murine sarcoma 3611 viral oncogene homolog 1 0.039
100627 HG2702-HT2798 Serine/Threonine Kinase (Gb:Z25424) 0.039
129424 M55593 Hs.111301 matrix metalloproteinase 2 (gelatinase A; 72kD gelatinase; 72kD type IV collagenase) 0.039
128652 AA621245 Hs.103147 ESTs; Weakly similar to similar to SP:YR40_BACSU [C.elegans] 0.039
129979 T72635 Hs.13958 ESTs 0.039
133468 X03068 Hs.73931 major histocompatibility complex; class II; DQ beta 1 0.04
102636 U67092 Human ataxia-telangiectasia locus protein (ATM) gene, exons 1a, 1b, 2, 3 and 4, partial cds 0.04
129536 M33493 Hs.184504 tryptase; alpha 0.04
133599 M64788 Hs.75151 RAP1; GTPase activating protein 1 0.041
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10 
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U62015 

AM12165 
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X8Q200 
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Z40883 

AA424535 

AA279481 
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Z83741 
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C02170 
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AA609878 

AA206465 
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Hs.83656 
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Hs.151531 

Hs.1 83760 

H&25300 



Hs*5717 

Hs.75765 
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Hs.120911 
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H&5038 

Hs.72242 

Hs.7891 
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Hs.12457 
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Hs.45073 
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Hs.98416 
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Hs.1 36031 
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H&234249 
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Hs.55289 
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Human atohaltXT) collagen (COL11A1) gene, 5* region and exon 1 

Homo sapiens chromosome 19; cosmid R27216

protein tyrosine phosphatase; non-receptor type 21

COX17 (yeast) homolog; cytochrome c oxidase assembly protein

alpha-methylacyl-CoA racemase

Homo sapiens clone 23622 mRNA sequence

branched chain keto acid dehydrogenase E1; alpha polypeptide



ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING
ENTRY !! [H.sapiens]

cartilage oligomeric matrix protein (pseudoachondroplasia;

epiphyseal dysplasia 1; multiple)

ESTs

immunoglobulin lambda-like polypeptide 2
RecQ protein-like 5

protein tyrosine phosphatase; receptor type; S
major histocompatibility complex; class II; DO beta 1
Rho GDP dissociation inhibitor (GDI) beta
collagen; type IX; alpha 2

Human endogenous retroviral H protease/integrase-derived ORF1



ESTs

lipocalin 1 (protein migrating faster than albumin; tear prealbumin)

pim-1 oncogene

ESTs

protein phosphatase 3 (formerly 2B); catalytic subunit; beta isoform

(calcineurin A beta)

aldolase A; fructose-bisphosphate

Triosephosphate Isomerase

Homo sapiens clones 24718 and 24825 mRNA sequence
Human B-cell receptor associated protein (hBAP) alternatively
spliced mRNA, partial 3' UTR
ESTs

GRO2 oncogene
ESTs
ESTs
ESTs

ESTs; Weakly similar to ALR [H.sapiens]
chromodomain helicase DNA binding protein 3



ESTs

Dystrophin-Associated Glycoprotein, 50 Kda, Alt. Splice 2
ESTs



Homo sapiens clone 23770 mRNA sequence
cysteine-rich; angiogenic inducer; 61
EST

Homo sapiens chromosome 19; cosmid R28379
ESTs

ESTs; Weakly similar to similar to collagen [C.elegans]



TNF receptor-associated factor 4
Potassium Channel Protein (Gb:Z11585)
ESTs; Weakly similar to d)393P1£2 [H.sapiens]
ESTs

ESTs; Weakly similar to collagen alpha 1 (XVIII) chain [M.musculus]

deiodinase; iodothyronine; type III

ESTs

H2A histone family; member M

mitogen-activated protein kinase 8 interacting protein 1

ESTs; Weakly smlr to weak smlrity to ribosomal prot L14 [C.elegans]

ESTs

ESTs; Weakly smlr to 110 KD CELL MEMBRANE GLYCOPROTEIN [H.sapiens]
EST

ESTs; Highly similar to Miz-1 protein [H.sapiens]
ESTs

protein tyrosine phosphatase type IVA; member 3

ESTs

ESTs
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0.044 
0.045
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130947 
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131413 
112304 
101416 
131201 
101054 
101306 
129311 

129942 
119210 
101046 
114086 
110171 
101004 
129715 
101581 
113285 
127537 
100813 
101841 
135053 
101419 
119724 
102673 
129877 
114788 
123812 
117669 
123782 
102395 
133795 
123193 
132595 
104161 
115330 
112893 
133475 
128699 
102940 
131299 
102435 
129594 
118593 
126702 
124386 

130538 
114299 
115604 
106052 
131730 
131285 
129705 
123175 
103592 
118196 

104886 
104250 

113301 
110441 
125297 
135258 
130633 
112006 



Z41309 

R40037 

W81679 

AA482390 

R54798 

M17254 

AA426304 

K02405 

(.41 143 

T55087 

U95301 

R93340 

K01160 

238266 

H19984 

J04101 

N58479 

M34996 

T66830 

AA569531 

HG3995-HT4265 

M93107 

R77159 

M17886 



U72509 

AA248589 

AA156737 
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N39237 
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U41767 

M12529 
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AA253369 

AA456471 
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T08000 

L29217 

K03207 

X13956 

AA431464 

U51240 

R70379 

N69020 

U54602 

N27368 

M20786 

Z40782 

AA400378 

AA416947 

U05681 

AA479498 

X78706 

AA489010 

Z30644 

N59478 

AAD53348 
AF000575 

T67452 

H50302 

Z39215 

AA292423 

T92363 

R42607 



Hs.12400

Hs.21506

Hs.5174

Hs.26510

Hs.26239

Hs.45514

Hs.24174

Hs.73933

Hs.232069



Hs.144442
Hs.92995

Hs.12770

Hs.31709

Hs.248109

Hs.12126

Hs.198253

Hs.182712

Hs.162859

Hs.76893
Hs.93678
Hs.177592
Hs.47622

Hs.13094

Hs.103904

Hs.111591

Hs.44977

Hs.162695

Hs.922Q8

Hs.169401

Hs.136956

Hs.155742

Hs.7724

Hs.88827

Hs.194684

Hs.73987

Hs.103972

Hs.24998

Hs.25426

Hs.79356

Hs.115396

Hs.207689

Hs.2785

Hs.212414

Hs.159509

Hs.22920

Hs.49391

Hs.6382

HsX1210

Hs.25274

Hs.12068

Hs.178400

Hs.123059

Hs.48396

Hs.144626
Hs.105928

Hs.13104

Hs.19845

Hs.159409

HsX7272

Hs.178703

Hs.22241



ESTs
ESTs

ribosomal protein S17

ESTs; Modly smlr to vacuolar prot sorting homolog r-vps33b [R.norvegicus]
ESTs

v-ets avian erythroblastosis virus E26 oncogene related
ESTs

Human MHC class II HLA-DQ-beta mRNA (DR7 DQw2); complete cds

T-cell leukemia translocation altered gene

yb45c0ar1 Stratagene total spleen (#937205) Homo sapiens cDNA

clone IMAGE74126 5', mRNA sequence.

phospholipase A2; group X

ESTs

Accession not listed in Genbank

Homo sapiens PAC clone DJ0777O23 from 7p14-p15

ESTs

v-ets avian erythroblastosis virus E26 oncogene homolog 1

ESTs; Weakly similar to LR8 [H.sapiens]

major histocompatibility complex; class II; DQ alpha 1

ESTs

ESTs

Cpg-Enriched Dna, Clone S19

3-hydroxybutyrate dehydrogenase (heart; mitochondrial)

ESTs

ribosomal protein; large; P1
ESTs

Human alternatively spliced B8 (87) mRNA, partial sequence

ESTs; Weakly similar to ORF YGR101W [S.cerevisiae]

EST

ESTs

ESTs

EST

a disintegrin and metalloproteinase domain 15 (metargidin)

apolipoprotein E

ESTs

glyoxylate reductase/hydroxypyruvate reductase

KIAA0963 protein

ESTs

bassoon (presynaptic cytomatrix protein)
CDC-like kinase 3



Hu 12S RNA induced by poly(rI); poly(rC) and Newcastle disease virus
ESTs; Weakly similar to unknown [H.sapiens]
Lysosomal-associated multispanning membrane protein-5
Human germline IgD chain gene; C-region; C-delta-1 domain
EST

keratin 17

sema domain; immunoglobulin domain (Ig); short basic domain;

secreted; (semaphorin) 3E

alpha-2-plasmin inhibitor

similar to S68401 (cattle) glucose induced gene

ESTs

ESTs; Highly similar to KIAA0612 protein [H.sapiens]
B-cell CLL/lymphoma 3

ESTs; Modly smlr to putative seven pass transmembrane prot [H.sapiens]

carnitine acetyltransferase

ESTs

chloride channel Kb

ESTs; Moderately similar to tumor necrosis factor-alpha
-induced protein B12 [H.sapiens]
growth differentiation factor 11

leukocyte immunoglobulin-like receptor; subfamily B (with TM

and ITIM domains); members

EST

ESTs; Highly smlr to prot phosphatase 2A BR gamma subunit [H.sapiens]
ESTs

ESTs; Weakly similar to dJ281H8.2 [H.sapiens]
ESTs
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124530 
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132793 
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121828 
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130153 

124403 
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112136 
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102069 
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123427 
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134791 

133700 

123887 
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105719 
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117437 
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134437 
107664 
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101574 
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103495 
129S07 
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128841 
100515 
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134516 
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103575 
115514 
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133912 
129581 
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AA4014S2 

AA026793 

AA425166 
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N46244 
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N31745 
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U15460 
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R46100 
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U09193 

AA455000 

AA491226 

Ml 66837 
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AM77106 
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AA219179 

J04444 
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K01396 

AA621065 

H05704 

AA291644 

HS2395 

N27645 

AA394133 

M26041 

AA010594 

AA349417 

M34182 

C0O476 

Y09022 

AA404594 

AA450040 

T16358 

HG1723+1T1729 

T54095 

AA171939 

X73608 

Z26256 

AA297739 

AA321355 
H55992 
X62744 
M33600 



Hs.170238
Hs.52699
Hs.17409
Hs.32060

Hs.58679

Hs.98497

Hs.172727

Hs.110373

Hs.15114

Hs.102493
Hs.168625

HsJ1833

Hs.44532

Hs.250640

Hs.41691

Hs.181131

Hs.9739

Hs.178543

Hs.82520

Hs.16725

Hs.105280

Hs.72620

Hs.111076

Hs.11081

HsX*7562

Hs.6147

Hs.110757

Hs.112471

Hs.19105

Hs.697

Hs.89655

Hs.75821

Hs.112943

Hs.110746

Hs.36793

Hs.190266



Hs.55698

Hs.188253

Hs.5326

Hs.96917

Hs.158029

Hs.24395

Hs.153591

Hs.11607

Hs.154162

Hs.106443



Hs.23413
Hs.93029

HsS5609



Hs.20495
Hs.77522
Hs.180255



sodium channel; voltage-gated; type I; beta polypeptide
MAA01BO protein

ESTs; Moderately similar to kinesin light chain 1 [M.musculus]
neurochondrin



EST

ESTs; Moderately similar to UV-1 protein [H.sapiens]
KIAA0906 protein

lymphotoxin beta receptor (TNFR superfamily; member 3)

cysteine-rich protein 1 (intestinal)

cysteine and glycine-rich protein 3 (cardiac LIM protein)

ESTs

ESTs; Weakly similar to 4F2/CD98 light chain [M.musculus]
ESTs

Treacher Collins-Franceschetti syndrome 1
ESTs

ras homolog gene family; member D

ESTs

ESTs

KIAA0979 protein
ESTs



H.sapiens mRNA for C0152 protein
sequence-specific single-stranded-DNA-binding protein
activating transcription factor B
ESTs
ESTs

Immunoglobulin mu

Hu 1.1 kb mRNA upregltd in retinoic acid treated HL-60 neutrophilic cells
ESTs

ESTs; Weakly similar to dJ963K23.2 [H.sapiens]
DKFZP434I114 protein

malate dehydrogenase 2; NAD (mitochondrial)
ESTs; Weakly similar to HPBRII-7 protein [H.sapiens]
ESTs

KIAA1075 protein

DNA segment on chromosome 21 (unique) 2056 expressed sequence
ESTs

translocase of inner mitochondrial membrane 17 (yeast) homolog B
cytochrome c-1

protein tyrosine phosphatase; receptor type; N
protease inhibitor 11
ESTs

H.sapiens HCR (a-helix coiled-coil rod homologue) mF
ESTs
ESTs

ywSe3.s1 Weizmann Olfactory Epithelium H.sapiens cDNA clone

IMAGE255676 3' smlr to contains L1.t3 L1 repetitive element; mRNA seq

ESTs; Highly similar to OASIS protein [M.musculus]

major histocompatibility complex; class II; DQ alpha 1

ESTs; Moderately similar to pim-1 protein [H.sapiens]

ESTs

protein kinase; cAMP-dependent; catalytic; gamma

small inducible cytokine subfamily B (Cys-X-Cys); member 14 (BRAK)

Not56 (D. melanogaster)-like protein

ESTs

ADP-ribosylation factor-like 2
ESTs

Macrophage Scavenger Receptor, Alt. Splice 2

ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens]

ESTs

sparc/osteonectin; cwcv and kazal-like domains proteoglycan (testican)
H.sapiens isoform 1 gene for L-type calcium channel, exon 1
ESTs; Weakly similar to ISOLEUCYL-TRNA SYNTHETASE;
CYTOPLASMIC [H.sapiens]

EST2393 Bone marrow Homo sapiens cDNA 5' end, mRNA sequence
DKFZP434F011 protein

major histocompatibility complex; class II; DM alpha
major histocompatibility complex; class II; DR beta 1
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BCS1 (yeas! homologHite 



ESTs

transcription elongation factor A (SII); 2

site-1 protease (subtilisin-like; sterol-regulated; cleaves sterol regulatory



ESTs
KIAA0128 protein

minichromosome maintenance deficient (S. cerevisiae) 4
ESTs

zm7c8.s1 Stratagene neuroepithelium (#937231) Homo sapiens cDNA
clone IMAGE3399 3', mRNA sequence
KIAA0422 protein

ESTs; Weakly similar to predicted using Genefinder [C.elegans]
ESTs

RuvB (E coli homolog)-like 2

ya94a02.s1 Stratagene placenta (#937225) Homo sapiens cDNA clone

IMAGE69290 3', mRNA sequence.

ESTs

VGF nerve growth factor inducible

phospholipase A2; group IVC (cytosolic; calcium-independent)

H.sapiens DAT1 gene, partial, VNTR

ligase III; DNA; ATP-dependent

ESTs; Highly similar to CGI-69 protein [H.sapiens]

IMP (inosine monophosphate) dehydrogenase 1

ESTs

ESTs

steroidogenic acute regulatory protein related
ESTs

ESTs; Highly similar to CGI-38 protein [H.sapiens]

ESTs; Weakly similar to mitogen inducible gene mig-2 [H.sapiens]

ESTs; Weakly similar to T20B12.3 [C.elegans]

ESTs

ESTs

ESTs

ESTs

Homo sapiens unknown protein mRNA, partial cds
prepronociceptin

TYRO protein tyrosine kinase binding protein
EST

dimethylarginine dimethylaminohydrolase 2



Human gamma-aminobutyric acid transaminase mRNA, partial cds

biglycan

ESTs

ESTs

EST; Moderately similar to CGI-136 protein [H.sapiens]

ESTs

ESTs

ESTs; Weakly similar to F42C5.7 gene product [C.elegans]

immunoglobulin gamma 3 (Gm marker)

estrogen-responsive B box protein

ESTs

ESTs

Wiskott-Aldrich syndrome (eczema-thrombocytopenia)

phosphofructokinase; muscle

KIAA1058 protein

ribosomal protein S12

ESTs

ESTs; Highly similar to HSPC013 [H.sapiens]

Human amiloride-sensitive epithelial sodium channel gamma subunit mRNA,

5' end, partial cds

ADP-ribosyltransferase (NAD+; poly (ADP-ribose) polymerase)-like 1
DKFZP727G051 protein

BRCA1 associated protein-1 (ubiquitin carboxy-terminal hydrolase)

ESTs

ESTs

ESTs

ESTs



0.064
0.064
0.064
0.054

0.064
0.064
0.064
0.064
0.064

0.064
0.064
0.064
0.064
0.064

0.065
0.065
0.065
0.065
0.065
0.065
0.065
0.065
0.065
0.065
0.065
0.065
0.065
0.066
0.065
0.065
0.065
0.065
0.066
0.066
0.066
0.068
0.066
0.066
0.066
0.067
0.067
0.067
0.067
0.067
0.067
0.067
0.067
0.067
0.067
0.067
0.067
0.067
0.067
0.067
0.067
0.066
0.068

0.068
0.068
0.068
0.068
0.068
0.069
0.069
0.069
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132905 


U70863 


Hs.182965 




105778 AA348910 Hs.153299
134770 R72079 Hs.89575
123097 AA485869 Hs.105671
100750 HG3523-HT4899
125091 T91518
100756 HG3565-HT3768
113483 T87768 Hs.16439
101119 L09708 Hs.2253
102288 U31628 Hs.12503
135349 D83174 Hs.9930
100991 J03764 Hs.82085
133675 AA443720 ns.7D0i
105422 AA251014 Hs.12210
102932 X13334 Hs.75627
119147 R58878 HsXJ5739
104900 AA055048 Hs.180481
133185 AA481404 Hs.6686
115498 AA290674 Hs.71819
121005 AA398332 Hs.97813
124869 R69088 Hs.28728
129154 N23673 Hs.108969
112161 R48295
125251 W87488 Hs.141464
134298 J00116 Hs.81343
119745 W70264 Hs.58093
131308 AA232686 Hs.25489
107776 AA018820 Hs.221147
134271 M199630 Hs.184458
101793 M85220
135402 S76942 Hs.99922
118742 N74052 Hs.50424
131867 N64656 Hs.3353
102923 X12517 Hs.1063
100775 HG371-HT26388
111020 N54361 Hs.185726
134224 X80822 Hs.163593
124059 F13673 Hs.99769
133972 AA160743 Hs.78019
129681 AA436009 Hs.178188
103065 X583S9 Hs.81221
124966 T19271 Hs.155560
112270 R53021 Hs.203358
116704 F10183 Hs.66140
129890 M13S99 Hs.111481
127345 AA972008 Hs.166253
112436 R63090 Hs.28391
114531 AA053033 Hs.203330
135122 H99080 Hs.94814
103934 AA281338 Hs.134200
109383 AA215369 Hs.185764
112647 R83329 Hs.33403
127083 Z44079 Hs.91608
133027 AA402624 Hs.63238
122088 AA432121 Hs.250986
110405 H47542 Hs.33962
128697 AB002344 Hs.103915
112221 R50360 Hs.25670
100478 HG1067-HT1067
115598 AA400129 Hs.65735
132491 AA227137 Hs.4984
101655 M60299
106018 AA411887 Hs.34737
123683 W05348 Hs.158198
134137 F10045 Hs.79347
114008 W89128 Hs.19872



Kruppel-like factor 4 (gut) 0.069

DOM-3 (C. elegans) homolog Z 0.069

CD79B antigen (immunoglobulin-associated beta) 0.069

ESTs 0.069

Proto-Oncogene C-Myc, Alt. Splice 3, Orf 114 0.069
ye20f0&s1 Stratagene lung (#937210) H.sapiens cDNA clone IMAGE:
3' similar to contains Alu repetitive element; contains MER12 repetitive element;

mRNA sequence. 0.069

Zinc Finger Protein (Gb:M88357) 0.069

ESTs 0.069

complement component 2 0.069

Interleukin 15 receptor; alpha 0.07

collagen-binding protein 2 (colligin 2) 0.07

plasminogen activator inhibitor; type I 0.07

ESTs; Weakly similar to T25G3.1 [C.elegans] 0.07

ESTs 0.07

CD14 antigen 0.07

ESTs 0.07

ESTs; Weakly similar to ACROSIN PRECURSOR [H.sapiens] 0.07

ESTs 0.07

eukaryotic translation initiation factor 4E binding protein 1 0.07

ESTs 0.07

ESTs; Weakly similar to F55A123 [C.elegans] 0.071

mannosidase; alpha; class 2B; member 1 0.071

ESTs; Wkly smlr to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 0.071

ESTs 0.071
collagen; type II; alpha 1 (primary osteoarthritis; spondyloepiphyseal

dysplasia; congenital) 0.071

ESTs 0.071

ESTs 0.071

ESTs 0.071
ESTs; Wkly smlr to !! ALU SUBFAMILY SX WARNING ENTRY !! [H.sapiens] 0.071

Accession not listed in Genbank 0.071

dopamine receptor D4 0.071

EST 0.071

Homo sapiens clone 24940 mRNA sequence 0.071

small nuclear ribonucleoprotein polypeptide C 0.072

Mucin 1, Epithelial, Alt. Splice 9 0.072

ESTs 0.072

ribosomal protein L18a 0.072

ESTs 0.072

Homo sapiens clone 24432 mRNA sequence 0.072

ESTs; Weakly similar to WASP-family protein [H.sapiens] 0.072

Human L2-9 transcript of unrearranged immunoglobulin V(H)5 pseudogene 0.072

calnexin 0.072

ESTs 0.072

EST 0.072

ceruloplasmin (ferroxidase) 0.072

ESTs; Highly similar to KIAA0476 protein [H.sapiens] 0.072

ESTs 0.072

ESTs 0.072

ESTs 0.072

Homo sapiens mRNA; cDNA DKFZp564C186 (from clone DKFZp564C186) 0.072

ESTs; Weakly similar to hypothetical protein [H.sapiens] 0.072

ESTs 0.073

otoferlin 0.073

synuclein; gamma (breast cancer-specific protein 1) 0.073

EST 0.073

ESTs 0.073

KIAA0346 protein 0.073

ESTs 0.073

Mucin (Gb:M22406) 0.073

ESTs 0.073

KIAA0828 protein 0.073

Human alpha-1 collagen type II gene, exons 1, 2 and 3 0.073

ESTs 0.073

DKFZP434B103 protein 0.073

KIAA0211 gene product 0.073

ESTs 0.073
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107653 AA010210 HS47041 

104798 AA029462 Hs.17235
134032 L16991 Hs.79006
119160 R80413 HsS2520
107741 AA016982 Hs.64341
133683 AA335223 Hs.75558
111694 R22035 Hs.23331
120764 AA338729 Hs.133096
119389 T88826 HSJ90973
100929 HG688-HT688
119388 T88798
133019 AF009674 Hs.184434
105185 AA191495 Hs.189937
133413 S72043 Hs.73133
101017 J04599 HsXfcl
132865 K02765 Hs.251972
110882 N36001 Hs.17348
129197 T90303 Hs.109308
101184 L19871 Hs.460
134910 AA431320 HSJ9100
119411 T96621 Hs.203656
102000 U01824 Hs.380
114691 AA121893 Hs.103779
134179 U53204 Hs.79706
134503 U34880 Hs.84183
129719 N68396 Hs.167766
113916 W80464 Hs.31928
113897 W73926 Hs.4947
129697 R00841 Hs.172069
112078 R44155 Hs.112218
121980 AA429886 Hs.110407
100898 HG4638-HT5050
121626 AM16974 Hs.98174
133670 AA243416 Hs.75470
131879 AA017161 Hs.33792
100254 D38037 Hs.77643
133194 AA291726 Hs.67201
106081 AM18394 Hs.25354
115544 AA351433 Hs.66187
119955 W87460 Hs£8989
104407 H61361 Hs.102171
135019 X58431 Hs.98428
114815 AA161488 Hs.103931
119471 W31352 Hs.55445
117788 N48292 Hs.46849
119406 T95064 Hs.193771
130777 R61742 Hs.256554
130494 L13197 Hs.75874
104107 AA424111 Hs.12598
121483 AA411981 Hs.25274
104451 M13299 Hs.102119
118027 N52770 Hs.75968
109419 AA227560 Hs.86987
115783 AA424487 Hs.72289
110585 H62223 Hs.133526
123165 AA488863 Hs.105216
103966 AA303166 Hs.127270
109549 F01528 Hs.21192
106730 AA465520 Hs.22313
120310 AA193676 Hs.118926
104078 AA402801 Hs.222010
117624 N35978 Hs.82364
112421 R62441 Hs.23127
106958 AA497026 Hs.22059
129984 W92811 Hs.183927
122044 AA431456 HsJ8736
123280 AA491285 Hs.175144
115710 AA412535 Hs£5235



ESTs 0.073

ESTs 0.073

deoxythymidylate kinase 0.073

ESTs 0.073

ESTs 0.073

pepsinogen 5; group I (pepsinogen A) 0.073

ESTs 0.073

ESTs 0.073

ESTs 0.074

Major Histocompatibility Complex, Class II, Dr Beta 2 (Gb:X65561) 0.074

plasminogen activator inhibitor; type I 0.074

axin 0.074

ESTs 0.074

metallothionein 3 (growth inhibitory factor (neurotrophic)) 0.074

biglycan 0.074

complement component 3 0.074

ESTs; Wkly smlr to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 0.074

ESTs; Wkly smlr to leucine-rich glioma-inactivated prot precursor [H.sapiens] 0.074

activating transcription factor 3 0.075

ESTs 0.075

EST 0.075

solute carrier family 1 (glial high affinity glutamate transporter); member 2 0.075

ESTs; Weakly similar to envelope protein [H.sapiens] 0.075

plectin 1; intermediate filament binding protein; 500kD 0.075
diphtheria toxin resistance protein required for diphthamide

biosynthesis (Saccharomyces)-like 1 0.075

ESTs; Moderately similar to Pro-a2(XI) [H.sapiens] 0.075

ESTs; Wkly smlr to alternatively spliced product using exon 13A [H.sapiens] 0.075

ESTs 0.075

DKFZP434C212 protein 0.075

ESTs 0.075
ESTs; Weakly similar to coded for by C. elegans cDNA yk173c12£ [C.elegans] 0.075

Spliceosomal Protein Sap 49 0.075

ESTs 0.075

hypothetical protein; expressed in osteoblast 0.075

ESTs 0.075

FK506-binding protein 1B (12.6 kD) 0.075

ESTs 0.075

ESTs 0.075

Homo sapiens clone 23700 mRNA sequence 0.076

ESTs 0.076

immunoglobulin superfamily containing leucine-rich repeat 0.076

Human Hox2.2 gene for a homeobox protein 0.076

DKFZP434B0335 protein 0.076

ESTs 0.076

ESTs 0.076

EST 0.076

ESTs 0.076

pregnancy-associated plasma protein A 0.076

T-cell lymphoma invasion and metastasis 2 0.076

ESTs; Modly smlr to putative seven pass transmembrane prot [H.sapiens] 0.076

blue cone pigment 0.076

thymosin; beta 4; X chromosome 0.076

receptor-interacting serine-threonine kinase 3 0.076

ESTs; Weakly similar to UV-1 protein [H.sapiens] 0.076
ESTs; Wkly smlr to !! ALU SUBFAMILY SB1 WARNING ENTRY !! [H.sapiens] 0.076

ESTs; Weakly smlr to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 0.077

ESTs 0.077

Homo sapiens clone 25155 mRNA sequence 0.077

ESTs 0.077

DKFZP586K0919 protein 0.077

ESTs 0.077

ESTs 0.077

ESTs 0.077

ESTs 0.077
ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 0.077

EST 0.077

ESTs 0.077
sphingomyelin phosphodiesterase 2; neutra 
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134129 


D87444 


Hs.79305 


1 membrane (neutral sphlngorrryefinase) 
KIAA0255 gene product 


129321 


AA224502 


Hs206501 


Homo sapiens done 643 unknown mRNA; complete sequence 


130513 


AA460257 


Ks.15866 


ESTs 


100996 


J03909 


Hs.14623 


Interferon; gamma-lnducible protein 30 


128358 


A1095718 


Hs.135015 


ESTs 


128544 


R59352 


Hs.1 19273 


KIAA0296 gene product 


106040 


AA412681 


Hs.125139 


ESTs 


106495 


AA452113 


H&32454 


ESTs; Moderately similar to KIAA0544 protein (Hxaplens] 


131833 


R40899 


Hs32973 


glycine receptor; beta 


119219 


R97176 


Hs.1 10783 


ESTs 


135415 


X60655 


H&99967 


evan-sklpped homeo box 1 (homolog of Drosophila) 


109457 


AA232646 


Hs.68061 


ESTs; WeaWy similar to sphlngoslrie kinase [Mmiscutus] 


117137 


H96670 


Hs.42221 


ESTs 


107094 


AA609614 


HS5241 


ESTs 


130165 


T90529 


Hs251613 


EST 


124072 


H05252 


Hs.101637 


EST; Weakly similar to hypctoeScal protein [Ksaplens] 


126151 


AA324743 


H&40808 


ESTs 


119035 


R01779 


Hs.7740 


ESTs 


110157 


H18987 


Hs.169731 


ESTs 


128515 


AA149044 


Hs.10086 


ESTs; Highly amSar to HYPOTHETICAL PROTEIN KWA0195 [H^apiens] 


133069 


U94838 


Hs.6430 


protein with pdyguitarnha repeat 


112209 


R49644 


H&24865 


ESTs ~ 


133361 


R28279 


Hs.71848 


Human done 23548 mRNA sequence 


134714 


U89922 


Hs.890 


rymphotoxin beta (INF superfamBy; member 3) 


129905 


TB6798 


Hs.132875 


ESTs; Weakly sWar to predicted using GeneSnder [Cetegans] 


120421 


AA238166 


Hs.132957 


ESTs; Wealdy similar to chondromodufin-l precursor [H.sapiens] 


100885 


HG44904fr4876 




ProBne-Rich Protein Prb4, Allele 


102789 


U86759 


Hs.158336 


netrin2(ch!cken)-like 


120139 


Z39273 


Hs.77876 


Human DNA from chromosome 19-specific cosmid R30923; genomic sequence


135238 


U76343 


Hs.96970 


Human liver GABA transport protein mRNA; 3' end


129618 


N54845 


Hs.173030 


ESTs 


132960 


AA609742 


Hs.6150 


KIAA0521 protein


108751 


AA127063 


HS503717 


ESTs 


134060 


D42039 


Hs.78871 


KIAA0081 protein 


111338 


N79778 


Hs.35094 


extracellular matrix protein 2; female organ and adipocyte specific 


112345 


R56880 


Hs.26563 


ESTs 


126456 


W00881 




za56d02.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone
IMAGE:296547 5', mRNA sequence.


128937 


Z39939 


Hs.10726 


ESTs 


103485 


Y08409 


Hs.248415 


thyroid hormone responsive SPOT14 (rat) homolog 


111202 


N68280 


Hs.107922 


ESTs 


132625 


AA429890 


Hs.166066 


cisplatin resistance associated


103434 


X98085 


Hs.54433


tenascin R (restrictin; janusin)


102616 


U6S581 


Hs.159191 


ribosomal protein L3-like


102667 


U70867 


Hs.83974


solute carrier family 21 (prostaglandin transporter); member 2


111422 


R01127 


Hs.19104 


ESTs 


101411 


M16938 


Hs.820


homeo box C8 


113267 


T65058 


Hs.12725 


ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens]


103559 


Z19585 


Hs.75774 


thrombospondin 4


131588 


AA258613 


Hs291B9 


KIAA1021 protein 


107821 


AA020991 


Hs.172856 


ESTs 


134278 


H82839 


Hs.81001 


ESTs; Weakly similar to DY3.6 [C.elegans]


120893 


AA369300 


HsH7058 


EST; Highly similar to CMP-N-acetylneuraminic acid hydroxylase [H.sapiens]


108786 


AA128999 




zo8f1Zs1 Stratagene neuroepithelium NT2RAMI 937234 Homo sapiens
cDNA clone IMAGE567119 3, mRNA sequence


106890 


AA489245 


Hs.88500


KIAA1066 protein; JSAP1 homolog (mouse); JIP3 homolog (mouse) 


119760 


W72267 


Hs£8219 


ESTs 


132999 


Y00787 


Hs.624 


interleukin 8 


129156 


AA028195 


Hs.108973 


dolichyl-phosphate mannosyltransferase polypeptide 2; regulatory subunit


121171 


AA400008 


Hs.161814 


ESTs 


103864 


AA207264 


Hs.181077 


ESTs; Weakly similar to Miller-Dieker lissencephaly gene [H.sapiens]


128591 


AA255537 


Hs.102057 


ESTs; Weakly similar to O-linked GlcNAc transferase [H.sapiens]


122172 


AA435753 


Hs.161854 


EST 


112802 


R97647 


Hs.174855 


EST 


107723 


AA015967 


Hs.60680 


EST 


113011 


T23737 


Hs.1600 


chaperonin containing TCP1; subunit 5 (epsilon)


131279 


AA089853 


Hs.25197 


STIP1 homology and U-Box containing protein 1 


103190 


X70083 


Hs.58414 


filamin C; gamma (actin-binding protein-280)



0.077
0.077
0.078
0.078
0.078
0.078
0.078
0.078
0.078
0.078
0.078
0.078
0.078
0.078
0.078
0.078
0.078
0.078
0.078
0.078
0.078
0.078
0.078
0.078
0.078
0.079
0.079
0.079
0.079
0.079
0.079
0.079
0.079
0.079
0.079
0.079
0.079
0.079
0.079
0.079
0.079
0.079
0.079
0.079
0.079
0.079
0.08
0.08
0.08
0.08
0.08
0.08
0.08
0.08
0.08
0.08
0.08
0.08
0.08
0.08
0.08
0.08
0.08
0.08
0.081
0.081
0.081
103356


AA2S2411 


H&233348 


112706 


R89828 


Hs.138493 


128126 


M85370 




130094 


H43288 


Hs.167017 


100300 


HG3945-HT4215 




108675 


AA115240 


Ks£1816 


129420 


AA234259 


H&99816 


129668 


M77349 


Hs.1 18787 


101645 


M59807 


Hs343 


130536 


T17045 


Hs.159492 


107732 


AA016181 


H&59752 


123071 


AA482593 


Hs.104285 


113537 


T90457 


Hs.191293 


101250 


L34060 


Hs.79133 


122521 


AA449433 


Hs.149227 


133914 


N32811 


Hs.77542 


102038 


U05659 


H&477 


110336 


H40338 


Hs.174094 


118637 


N70274 


Hs.49822 


117986 


N51589 


HSS4012 


104424 


K87671 


Hs.182320 


100361 


D78361 


Hs.125078 


112974 


T17291 


Hs.101174 


132832 


063482 


H&57734 


132039 


Z39489 


Hs3781 


113272 


TB5383 


Hs.12807 


104924 


AA058532 


H&28774 


111061 


N58054 


H&36859 


129269 


R45977 


Hs.163593 


102453 


U4B437 


Hs.74565 


126204 


A1080388 


Hs.134296 


116615 


D80666 


Hs.45203 


128856 


AA219552 


H&204144 


112776 


R95850 


HS34494 


105494 


AA256273 


Hs29288 


117000 


H84718 


Hs.1 12236 


112656 


R85260 


nS.133151 


128963 


J03890 


Hs.1074 


116957 


H79292 


H&39960 


101057 


K03430 




121948 


A A JAM J PA 

AA42S452 


HSXJB58Z 


130822 


M80647 


H&2001 


122743 


AA458674 


Hs.99478 


114569 


AA083316 




132270 


U70671 




108126 


AA052951 


Hs.47413 


102880 


X04325 


HS2679 


115385 


AA282089 


Hi88599 


114529 


AA052980 


H&206704 


135017 


AA249586 


Hs.9315 


123776 


AA610071 


Hs.1 12813 


114454 


AA021091 


HSJ226208 


101246 


L33799 


H&202097 


107366 


U78310 


Hs.13501 


132779 


T89601 


HsU5497 


129709 


Ml 12209 


Hs.1209 


115244 


AA278767 


Hs514 


123253 


AA490878 


Hs.1 11334 


128469 


T23724 


HS258677 


132220 


AA431847 


Hs.42409 


111664 


R17939 


H&22344 


102354 


U38268 




112828 


R98774 


Hs.1 94338 



ESTs 
ESTs 

EST01884 Fetal brain, Stratagene (cat#936206) Homo sapiens cDNA clone HFBCH10, mRNA sequence.

gamma-aminobutyric acid (GABA) B receptor; 1

Phospholipid Transfer Protein 

ESTs 

ESTs 

transforming growth factor, beta-induced; 68kD

natural killer cell transcript 4

spastic ataxia of Charlevoix-Saguenay (sacsin)

ESTs 

ESTs 

ESTs 

cadherin 8

ESTs; Weakly similar to PROLINE-RICH PROTEIN MP-3 [M.musculus]
ESTs 



ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens]

ESTs 

ESTs 

ESTs; Weakly similar to Mouse 1 95 mRNA; complete cds [M.musculus]

Human mRNA for ornithine decarboxylase antizyme; ORF 1 and ORF 2

microtubule-associated protein tau

KIAA0148 gene product

Homo sapiens BAC clone RG118D07 from 7q31

ESTs 

ESTs 

ESTs 

ribosomal protein L18a 

amyloid beta (A4) precursor-like protein 1

ESTs 

ESTs 

ESTs; Modly smlr to tumor necrosis factor-alpha-induced prot B12 [H.sapiens]
ESTs 

Homo sapiens mRNA; cDNA DKFZp434P174 (from clone DKFZp434P174)

ESTs; Weakly similar to repressor protein [H.sapiens]

transient receptor potential channel 7 

surfactant; pulmonary-associated protein C

ESTs 

Human complement C1q B-chain gene, exon A+1
ESTs 

thromboxane A synthase 1 (platelet; cytochrome P450; subfamily V) 
EST 

zm2d1 51 Stratagene corneal stroma (#937222) Homo sapiens cDNA clone
IMAGE5129473 , similar to TR:E198281 E198281 THIOREDOXIN
REDUCTASE; contains Alu repetitive element; mRNA sequence
ataxin 2 related protein 
ESTs 

gap junction protein; beta 1; 32kD (connexin 32; Charcot-Marie-Tooth neuropathy, X-linked)

ESTs 

ESTs 

ESTs; Weakly similar to NEURONAL OLFACTOMEDIN-RELATED ER LOCALIZED PROTEIN [H.sapiens]

ESTs 

ESTs 

procollagen C-endopeptidase enhancer

pescadillo (zebrafish) homolog 1; containing BRCT domain

ESTs; Weakly similar to GLUCOSE TRANSPORTER TYPE 5; SMALL INTESTINE [H.sapiens]

acyl-Coenzyme A dehydrogenase; long chain 

Human mRNA for SB class II histocompatibility antigen alpha-chain

ferritin; light polypeptide

EST 

ESTs; Highly similar to CGI-46 protein [H.sapiens]
ESTs 

Human cytochrome b pseudogene, partial cds 
ESTs 



0.081
0.081
0.081
0.081
0.081
0.081
0.081
0.081
0.081
0.081
0.081
0.081
0.081
0.081
0.081
0.081
0.081
0.081
0.082
0.082
0.082
0.082
0.082
0.082
0.082
0.082
0.082
0.082
0.082
0.082
0.082
0.082
0.082
0.082
0.082
0.082
0.083
0.083
0.083
0.083
0.083
0.083
0.083
0.083
0.083
0.083
0.083
0.083
0.083
0.083
0.083
0.083
0.083
0.083
0.083
0.083
0.083
0.083
0.083
0.083
0.084
0.084
110410


H47868 


H&34024 








102550 


U58087 


Hs.14541 


108417 


AA075716 




113299 


T67285 


Hs.13089 


117869 


N49947 


Hs46990 


113734 


T98484 


Hs.18377 


133325 


C00424 


Hs.7101 


123368 


AA505022 


Hs.124838 


101615 


M55153 


HsX265 


119352 


T65972 


Hs.193365 


123828 


AA620686 


Hs.1 12884 


103811 


Z38133 


Hs.1 13973 


131289 


AA485697 


H&25334 


128878 


T15896 


Hs.103535 


130814 


AA256695 


Hs.19813 


133391 


X57579 


Hs.727 


129322 


AA437153 


Hs.1 10407 


109284 


AA196995 


H&86092 


116689 


F09222 


Hs.66099 


100545 


HG2147-HT2217 




102634 


U6S711 


Hs.77667 


111735 


R25389 


H&23856 


105181 


AA190878 


Hs.10974 


122681 


AA455350 


Hs.99401 


114543 


AA056121 


Hs.158419 


133597 


AA425908 


Hs.75139 


121084 


AA398647 


Hs.97406 


122231 


AA436369 


Hs.197728 


100309 


D50550 


Hs.85659 


101727 


M73481 


Hs.73883 


131226 


AA165400 


H&24476 


133580 


AA095041 


Hs.181073 


102792 


U87984 


H&227576 


104976 


AA086480 


Hs.183669 


120865 


AA350631 


Hsfl6963 


106080 


AA418046 


H&35124 


128571 


AA416619 


Hs.101661 


101838 


M92934 


Hs.75511 


128514 


H84281 


Hs.100843 


123099 


AA485931 


Hs.79 


134067 


Y08200 


H&78920 




noUoiio 




110053 


H12S86 


H&89563 


114395 


AA007313 


Hs.1 10155 


107485 


W44681 


HS251385 


101983 


S85655 


Hs.75323 


112544 


R70948 


Hs29153 


111423 


R01165 


Ha188507 


127918 


AA806043 


Hs.1 15396 


107300 


T40348 


Hs.90488 


134947 


R51194 




124579 


N68345 


Hs.127179 


130471 


Z68280 


Hs.183706 


116596 


D60755 


Hs.92955 


105069 


AA136345 


H&23617 


102491 


U51010 




130069 


AA055896 


Hs.146428 


130234 


AA280413 


Hs.157441 


120540 


AA2S2992 


HS56417 


122508 


AA449221 


H&20432 



ESTs 

Human clone W2-6 mRNA from chromosome X
cu&nl. 

zm89e5.s1 Stratagene ovarian cancer (#937219) H sapiens cDNA clone IMAGE54512 2 similar to gb:X14723 CLUSTERIN PRECURSOR (HUMAN); mRNA sequence

ESTs 

ESTs 

EST 

periodontal ligament fibroblast protein 
ESTs 

transglutaminase 2 (C polypeptide; orotelr^glitamhe 



ESTs; Moderately similar to alternatively spliced product using exon 13A [H.sapiens]

EST 

myosin; heavy polypeptide 8; skeletal muscle; perinatal 

ESTs; Weakly similar to ION CHANNEL HOMOLOG RIC PRECURSOR [M.musculus]

ESTs 

ESTs 



ESTs; Weakly similar to coded for by C. elegans cDNA yk173c12.5 [C.elegans]

ESTs 

ESTs 

Mucin 3, Intestinal (Gb:M55405)

lymphocyte antigen 6 complex; locus E 

ESTs; Weakly similar to FAST kinase [H.sapiens]

ESTs; Moderately similar to unknown [R.norvegicus]

EST 

ESTs 

partner of RAC1 (arfaptin 2)
ESTs 

ESTs; Weakly similar to ZINC FINGER PROTEIN 135 [H.sapiens]

lethal giant larvae (Drosophila) homolog 1

gastrin-releasing peptide receptor

ESTs 

ESTs 

GTP binding protein 1 

ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens]

EST 

ESTs 

ESTs 

connective tissue growth factor 

ESTs; Weakly similar to similar to GTP-binding protein [C.elegans]
aminoacylase 1

Rab geranylgeranyltransferase; alpha subunit
EST 

nuclear cap binding protein 1; 80kD
ESTs 

murine retrovirus integration site 1 homolog

prohibitin 

ESTs 

ESTs 

Human germline IgD chain gene; C-region; C-delta-1 domain
ESTs 

yj71a08.r1 Soares breast 2NbHBst Homo sapiens cDNA clone IMAGE:154166 5' similar to gb:L11284 DUAL SPECIFICITY MITOGEN-ACTIVATED PROTEIN KINASE KINASE 1 (HUMAN); mRNA sequence.

ESTs; Weakly similar to TERATOCARCINOMA-DERIVED GROWTH FACTOR 1 [H.sapiens]

adducin 1 (alpha)

ESTs 

ESTs; Weakly similar to ZFOC1 gene product [H.sapiens]

Human nicotinamide N-methyltransferase gene, exon 1 and 5' flanking region

collagen; type V; alpha 1 

spleen focus forming virus (SFFV) proviral integration oncogene spi1

ESTs 

ESTs 



0j084 
0j084 
0.084 



0.084 
Oj034 
OJ084 
0X184 
0j084 
0j0B4 

0X84 

0.084 
0.084 
0.084 

0.084 
0X84 
0.084 
0.084 
0.084 
0.084 
0.085 
0X85 
0.085 
0.085 
0X85 
0X85 
0.085 

aoss 

0.085 
0.085 
0.085 
0.085 
0X85 
0X85 
0X85 
0X85 
0X85 
0X85 
0X85 
0X85 
0X85 
0X85 
0X85 
0X85 
0X85 
0X85 
0X85 
0X85 
0.088 
0.088 
0X86 
0X86 



0X86 

0X86 
0X86 
0X86 
0.086 
0.086 
0X86 
0X36 
0X86 
0X86 
128054 AI205718 Hs.125416 ESTs 0.088

133020 AA0S3248 Hs.185182 ESTs; Highly similar to 40S RIBOSOMAL PROTEIN S10 [H.sapiens] O08S

130056 AA017358 Hs.171900 armadillo repeat gene deletes in velocardiofacial syndrome 0.086

130504 U48865 Hs.158323 CCAAT/enhancer binding protein (C/EBP); epsilon 0.086

133978 W73859 Hs.78061 transcription factor 21 0.086

105265 AA227841 Hs£6088 ESTs 0.088

133035 T15965 Hs.6333 ESTs 0.086

100768 HG3636-HT3846 Myosin, Heavy Polypeptide 9, Non-Muscle 0.086

129338 T56800 Hs.47274 Homo sapiens mRNA; cDNA DKFZp564B178 (from clone DKFZp564B176) 0.086

132789 W23761 Hs.56876 ESTs 0.086

116099 AA456309 Hs.58831 regulator of Fas-induced apoptosis 0.088

100721 HG3355-HT3532 Peroxisome Proliferator Activated Receptor (Gb:Z30972) 0.087

112569 R73150 Hs.75270 GTP-binding protein homologous to Saccharomyces cerevisiae SEC4 0.087

130645 AA020942 Hs.17200 STAM-like protein containing SH3 and ITAM domains 2 0.087

100751 HG3527-HT3721 Luteinizing Hormone, Beta Subunit 0.087

134550 M27161 Hs.85258 CD8 antigen; alpha polypeptide (p32) 0.087

130885 AA338646 Hs.20912 adenomatous polyposis coli like 0.087

101446 M21302 Hs.56306 small proline-rich protein 2A 0.087

116287 AA4S7856 Hs.155829 KIAA0676 protein 0.087

134034 X89267 Hs.78601 uroporphyrinogen decarboxylase 0.087

130860 U66061 Hs.241395 protease; serine; 1 (trypsin 1) 0.087

109901 H04992 Hs.30499 ESTs 0.087

107537 Z20777 Hs.9857 ESTs; Weakly similar to peroxisomal short-chain alcohol dehydrogenase [H.sapiens] 0.087

133232 AA496030 Hs£845 ESTs 0.087

108559 AA085161 zn12c5.s1 Stratagene hNT neuron (#937233) H sapiens cDNA clone IMAGE54728 3 1 similar to TR:G1151228 G1151228 LPG1P. ; mRNA seq 0.087

121288 AA401735 HsX)7340 EST 0.087

108844 AA132916 Hs.177961 Human Chromosome 16 BAC clone CIT987SK-A-388D4 0.087

129874 AA406488 Hs.181551 ESTs 0.087

105139 AA164543 Hs.110082 ESTs 0.088

124789 R43803 Hs.78110 ESTs; Weakly similar to F17A9.2 [C.elegans] 0.088

115923 AA441829 HsH8205 ESTs 0.088

123640 AAS09292 Hs.112681 ESTs 0.088

131607 AA351409 Hs.172740 microtubule-associated protein; RP/EB family; member 3 0.088

130064 T67053 Hs.181125 immunoglobulin lambda gene cluster 0.088

108752 AA127070 Hs.71055 ESTs 0.088

124249 H66077 Hs.108211 ESTs 0.088

100109 AJ000480 Hs.143513 phosphoprotein regulated by mitogenic pathways 0.088

104642 AA004662 Hs.184245 KIAA0929 protein Msx2 interacting nuclear target (MINT) homolog 0.088

131752 AA453311 Hs.31566 ESTs 0.088

114727 AA132545 Hs.190202 ESTs 0.088

120965 AA398089 Hs.179715 ESTs 0.088

100396 D84361 Hs.151123 Human mRNA for p52 and p64 isoforms of N-Shc; complete cds 0.088

106218 AA428451 HsS1146 DKFZP586E0820 protein 0.088

111562 R09567 Hs.187569 ESTs 0.088

121219 AA400606 Hs.144344 EST 0.088

101187 L20316 HsXM8 glucagon receptor 0.088

101513 M28210 Hs.27744 RAB3A; member RAS oncogene family 0.088

116454 AA621071 Hs.42034 ESTs; Moderately similar to T-complex protein 10A [H.sapiens] 0.088

116171 AA463434 Hs.42658 ESTs 0.089

117500 N31909 Hs.44278 ESTs 0.089

119978 W88623 Hs.59190 EST 0.089

132005 D58231 Hs.173091 DKFZP434K151 protein 0.089

109914 H05529 Hs.194704 leucine-rich; glioma inactivated 1 0.089

130370 M55265 Hs.155140 casein kinase 2; alpha 1 polypeptide 0.089

104262 AF009801 Hs.105941 bagpipe homeobox (Drosophila) homolog 1 0.089

129708 AA417181 Hs.120858 ESTs 0.089

106398 AA447545 Hs.18268 adenylate kinase 6 0.089

120884 AA365356 HsX*7041 ESTs 0.089

130404 X72012 Hs.76753 endoglin (Osler-Rendu-Weber syndrome 1) 0.089

114072 Z38184 Hs.123633 ESTs 0.089

131470 X54938 Hs.2722 inositol 1;4;5-trisphosphate 3-kinase A 0.089

124573 N67935 Hs.194703 adaptor-related protein complex 4; mu 1 subunit 0.089

114717 AA131240 Hs.252014 EST 0.089

133806 M12759 Hs.76325 Human Ig J chain gene 0.09

130470 AA398552 Hs.15711 KIAA0639 protein 0.09

133182 Z80787 Hs.240135 H4 histone family; member J 0.09

116036 AA452572 Hs.43866 ESTs 0.09 
132404
122695 
125975 
110783 
129860 
120740 
119564 
134474 
119014 
109791 
117605 
121589 
104326 
129861 
102795 
119626 
110516 
105382 
123754 
108008 
121057 
123675 
135194 
127070 
134051 
133382 
103615 
118457 
118504 
112915 
132088 
101504 
112550 
128551 
112879 
127079 
101993 
113020 
120465 
130152 
104941 
110090 
135375 
123799 



125147 
100836 
114726 
107311 
112863 
129290 
103384 

112508 
111863 
131184 
107420 
111768 
112290 
130581 
120744 
112226 
116154 
102640 
129797 
102705 
132408 
108441 



AA333903 

AA456048 

AA495B91 

N23G69 

AA410343 

AA3Q2650 

W38206 

AA054746 

N95435 

F10669 

N35073 

AM16627 

081655 

N69507 



W4S4S9 

H56894 

AA236853 

AA609984 

AA039430 

AA398619 

AA60S474 

C20975 

AA641812 

S6707O 

AA112532 

Z46967 

N66S93 

N67334 

T10176 

AA470121 

M27288 

R71391 

H09058 

T03541 

AI384691 

U01062 

T23830 

AA251505 

U32645 

AA065169 

H16076 

AA480888 

AA620418 

N93438 

H80833 

W3B150 

HGW113-HT4383 

AA132509 

T57738 

T03148 

AA521407 

X92762 

R68213 

R37495 

AA452705 

W26567 

R27606 

R53940 

AA481982 

AA302772 

R50761 

AA460951 

U67674 

X53595 

U77180 

AA035547 

AA079079 



Hs.4768 

H&99403 

Hs.152290 

H&26407 

Hs.129826 

H&96654 

H&8379 

H&55144 

Hs.13228 

H&44433 

Hs.191593 

Hs.143067 

Hs.129849 

Hs.198395 

Hs.184456 

Hs37368 

Hs.111801 

Hs.102021 

H&61920 

Hs.142375 

Hs.112713 

H&S613 

Ks.190037 

Hs.78846 

Hs.7247 

Hs.115460 

K&4S230 

H&50158 

Hs.4254 

H&243960 

K&248156 

H&29074 

H&237323 

Hs.1 15960 

Hs.128628 

Hs.77515 

Hs.7303 

H&130861 

Hs.151139 

Hs.17805 

Hs.6915 

H&99741 

Hs.112861 

Hs.76907 

Hs.143038 



Hs.103827 
Hs.174112 
Hs.4610 
Hs.1 10095 
Hs.79021 

H&28847 

H&23578 

H&23954 

Hs.4775 

H&24185 

H&26016 

Hs.16258 

K&228649 

Hs25738 

Hl57100 

Hs.194783 

Hs.1252 

H&50002 

Hs.47822 



ESTs 

ESTs; Moderately similar to undulin 2 [H.sapiens]

ESTs; Highly similar to PACAP type-3/VIP type-2 receptor [H.sapiens]

ESTs 



EST 

Accession not listed in Genbank

ESTs 

ESTs 

DRE-antagonist modulator; calsenilin

ESTs 

ESTs 

ESTs 

DKFZP564M182 protein 

ATP-binding cassette; sub-family A (ABC1); member 4

ESTs; Wkly smlr to !!!! ALU SUBFAMILY SX WARNING ENTRY !!!! [H.sapiens]

EST 

Homo sapiens mRNA; cDNA DKFZp564H2023 (from clone DKFZp564H2023)

ESTs 

ESTs 



EST 



ESTs 

heat shock 27kD protein 2

ESTs 

calicin

EST 

ESTs 

ESTs 



oncostatin M
ESTs 

N-acetylglucosamine-phosphate mutase; DKFZP434B187 protein
ESTs 

ESTs; Moderately similar to CL3BC [R.norvegicus]
inositol 1;4;5-triphosphate receptor; type 3
ESTs; Weakly similar to PROHIBITIN [H.sapiens]
ESTs 

E74-like factor 4 (ets domain transcription factor)

ESTs 

ESTs 

ESTs; Weakly similar to BRAIN PROTEIN H5 [H.sapiens]
ESTs 

ESTs; Highly similar to HSPC002 [H.sapiens]
ESTs 

Accession not listed in Genbank

Olfactory Receptor Or17-201 

EST 

ESTs 

EST 

ESTs 

tafazzin (cardiomyopathy; dilated 3A (X-linked); endocardial fibroelastosis 2; Barth syndrome)

ESTs 

ESTs 

ESTs; Weakly similar to KIAA0584 protein [H.sapiens]

ESTs 

ESTs 

ESTs 

ESTs; Weakly similar to RAS-RELATED PROTEIN RAB-5A [H.sapiens]

EST 

ESTs 

ESTs 

solute carrier family 10 (sodium/bile acid cotransporter family); member 2

apolipoprotein H (beta-2-glycoprotein I)

small inducible cytokine subfamily A (Cys-Cys); member 19

KIAA0380 gene product; RhoA-specific guanine nucleotide exchange factor

zm97c9.s1 Stratagene colon HT29 (#937221) Homo sapiens cDNA clone



0.09
0.09
0.09
0.09
0.09
0.09
0.09
0.09
0.09
0.09
0.09
0.09
0.09
0.09
0.09
0.09
0.09
0.09
0.09
0.09
0.091
0.091
0.091
0.091
0.091
0.091
0.091
0.091
0.091
0.091
0.091
0.091
0.091
0.091
0.091
0.091
0.091
0.091
0.091
0.091
0.091
0.091
0.091
0.092
0.092
0.092
0.092
0.092
0.092
0.092
0.092
0.092
0.092
0.092
0.092
0.092
0.092
0.092
0.092
0.092
0.093
0.093
0.093
0.093
0.093
0.093
0.093
108145
106466 
101697 
121294 

117824 
115771 
102303 
131405 
112909 
124173 
112488 
130554 
106413 
111711 
117595 
113313 
107769 

114966 

130297 
109589 
112592 
102314 
116128 
106809 



AA054133 
AA449990 
M643S8 
AA401958 

N49065 

AA422049 

U33053 

U79255 

T10069 

H41281 



X59303 

AA447964 

R22891 

N34933 

W45174 

AA018449 

AA250743 

HS4949 

F02429 

R77631 

U34038 

AA459915 

AA479704 



Hs.63085 
Hs.76057. 

HS540170 

Hs.125201 

Hs.40780 

H&2499 

H&26468 

Hs.101094 

Hs.107619 

H&28788 

Hs.159637 

Hsj6311 

Hs.7093 ' 

H&44684 

H&31382 

Hs.125220 

Hs.92198 

Hs.171955 

H&65B1 

H&29126 

Hs.154299 

Hs.1 12193 

HS220324 



IMAGE:545872 3' similar to contains element MER22 MER22 repetitive element; mRNA sequence

ESTs 



Human rtiom-3 gene, exon 
ESTs; I 

exon 13A[H .sapiens) 
ESTs; Weakly similar to B7 [M.musculus]
ESTs 

protein kinase C-like 1

amyloid beta (A4) precursor protein-binding; family A; member 2 (X11-like)

ESTs 

ESTs 

ESTs 



ESTs 
ESTs 
EST 
ESTs 

Homo sapiens DNA from chromosome 19-cosmids R30102-R29350-R27740
containing MEF2B; genomic sequence 



CIWSP-24fRsap"ens] 



ESTs 
ESTs 



coagulation factor II (thrombin) receptor-like 1
mutS (E. coli) homolog 5

Human DNA sequence from clone 283E3 on chromosome 1p36.21-36.33.
Contains the alternatively spliced gene for Matrix Metalloproteinase in the



0.093 
0X193 
0.093 
0.093 

Oj093 
0.093 
0.093 
0.093 
0.093 
0.093 
0.093 

om 

0.093 
0X83 
0.094 
0.094 
0X194 

0X194 

0X194 
0.094 
0.094 
0.094 
0.094 
0.094 









the alternatively spliced CDC2L2 gene for


0X94 


130607 


AA043894 


Hs.16603 


ESTs 


0.094 


120592 


AA281929 


Hs.143974 


ESTs 


0.094 


117230


N20535 


Hs.43265 


metastaflnl 


0.094 


105948 


AA404597 


Hs.7133 


ESTs 


0J094 


101333 


L47738 


H&80313 


p53 inducible protein


0:094 


101909 


S69265




Homo sapiens mRNA for PLE21 protein; complete cds 


0.094 


106959 


AA497031 


H&8657 


ESTs; Highly similar to CTG7a [H.sapiens]


0X194 


127034 


AA352389 




ESTs; Wkly smlr to glucose-6-phosphatase catalytic subunit [R.norvegicus]


Oj095 


134430 


H52105 


H&8309 


KIAA0747 protein 


0.095 


120342 


AA207105 


Hs.45068 


Homo sapiens mRNA; cDNA DKFZp434l143 (from clone DKFZp434l143) 


0XH5 


104450 


L77564 


Hs.103978 


serine/threonine kinase 22B (spermiogenesis associated)


0X195 


130902 


AA424530 


H&21061 


ESTs 


0X195 


102708 


U77594 


H&37682 


retinoic acid receptor responder (tazarotene induced) 2


0X195 


107373 


U85773


Hs.154695 


phosphomannomutase 2


0X195 


123569 


AA608952 


H&195292 


ESTs; Weakly similar to RNA helicase HDB/DICE1 [H.sapiens]


0X195 


102687 


U73379 


H&93002 


ubiquitin carrier protein E2-C


0X195 


128888 


AA0349S1 


Hs.106893 


ESTs 


0.095 


100283 


D43642 


H&2430 


transcription factor-like 1 


0X195 


102747 


U79303 


H&82482 


protein predicted by clone 23882


0j095 


107798 


AA019346 


Hs-60918 


EST 


0X195 


123565 


AA608907 


Hs.1 12614 


EST 


0X195 


116010 


AA449450 


K&56421 


ESTs; Weakly similar to Similarity to H.influenzae ribonuclease PH [C.elegans]


0X195 


117155 


H97536 


H&42391 


EST 


0X195 


133094 


AA115572


Hs.64746 


chloride intracellular channel 3


0.095 


113174 


T54659 


Hs.9779 


ESTs 


0X195 


102016 


U03270 


Hs.122511 


centrin; EF-hand protein; 1 


0.095 


130126 


AB002318 


Hs.150443 


KIAA0320 protein 


0.095 


134813 


X14767 


Hs.89768 


gamma-aminobutyric acid (GABA) A receptor; beta 1


0.095 


132055 


N69440 


HsJ8132 


ESTs 


0.095 


122229 


AA436198 


Hs.103902 


ESTs 


0.096 


127574 


AA907314 


Hs.188905 


ESTs 


0.098 


134432 


AA053022 


H&8312 


ESTs 


0.096 


128052 


AA878398 


Hs.190491 


ESTs 


0.096 


101637 


M58285 


Hs.132834 


hematopoietic protein 1 


0X196 


103386 


X92972 


Hs.80324 


protein phosphatase 6; catalytic subunit 


0XB6 


133079 


AA477561 


Hs.6449 


ESTs 


0.096 


120328 


AA196979 


Hs.104129 


ESTs; Weakly similar to protease [H sapiens] 


0.096 
107640


AA009815 


H&257808 


ESTs 




123389 


AA521176 


H&221231 


ESTs 




103222 


X74795


Hs.77171 


minichromosome maintenance deficient (S. cerevisiae) 5 (cell division cycle 46)


V.U80 


111704 


R22450 


H&23396 


ESTs; Highly similar to ZINC RNufcn pku i on i4u [n.sapi8nsj 




126858 


AA306523 




EST177475 Jurkat T-cells VI Homo sapiens cDNA 5' end, mRNA sequence.




127071 


AA250803 




ESTs 


A AAA 


114550 


AA056755 


Hs.151714 


ESTs 


onofi 

UiUaD 


125955 


AB56943 


Hs.143761 


ESTs 


a oqc 


134363 


M37033 


H&82212 


CD53 antigen 


a noft 


128550 


W76492 


Hs.170142 


ESTs 


O OCR 


122598 


AA45346S 


HsJ9329 


ESTs 


0.090 


118898 


N90703 


H&4236 


KIAA0478 gene product 


ft aog 


117661 


N39092 


Hs.44940 


ESTs 


0.09b 


120996 


AA398281 


Hs.143684 


ESTs 


Oj096 


123388 


AA521172 


Hs.134417 


ESTs 


ft AOO 


106700 


AA463929 


Hs28701 


ESTs 


ft AfiO 

OjOSo 


112962 


T16814 


H&6828 


ESTs 


A AOC 


121262 


AA401372 


H&97723 


ESTs 


A AOfl 


134551 


R44S39 


H&8526 


beta-1,3-N-acetylglucosaminyltransferase


0.096 


112060 


R43754 


H&21164 


ESTs 


0.0S6 


134678 


AA039935 


Hs.182595 


dynein; axonemal; light polypeptide 4


A AO ft 

0.09O 


100855 


HG4234OT4504 




Methylenetetrahydrofolate Reductase


a ao*7 


132414 


N91193 


H&48145 


ESTs 


a ao7 
U.U9/ 


112900 


T08758 


H&3813 


ESTs 


A Afl7 

0.097 


115989 


AA447777 


H&83135 


ESTs 


A Afl7 
0.097 


103561 


Z21488


Hs.143434 


contactin 1


A ACT 


131087 


AA009738 


H&22824 


ESTs; Weakly similar to p160 myb-binding protein [M.musculus]


A AAT 
0.09/ 


120293 


M1908S9 


Hs.191428 


ESTs 


0.097 


111830 


R36081 


H&25085 


EST 


A A07 


113654 


T95770 


Hs.17666 


ESTs 


A A07 


132675 


AA179338 


H&5476 


serine proteinase Inhibitor 


A AQ7 


120182 


Z40125 


Hs.91968 


ESTs 


0.097 


132879 


U16282 


H&5881 


ELL gene (11-19 lysine-rich leukemia gene)


A A07 


134211 


AA056681 


H&80021 


ESTs; Weakly similar to 62D9.p [D.melanogaster]


A AQ7 


115448 


AA284845 


Hs.165051 


ESTs 


A AQ7 


118118 


N56901 


Hs.47995 


ESTs 


0.097 


107598 


AAD04528 


Hs.169444


ESTs 


A AQ7 


128933 


H01824 


Hs.760 


GATA-binding protein 2


A AQ7 
U.U3/ 


114892 


AA235988 


H&86024 


ESTs 


A AA7 


101922 


S75168 


H&274 


megakaryocyte-associated tyrosine kinase


A AQ7 


105444 


AA252374 


Hs.19333 


ESTs; Weakly similar to ATP(GTP)-binding protein [H.sapiens]


A A07 


128155 


AA926843 


Hs.143302 


ESTs 


A AD7 


116276 


AA485870 


Hs.44914 


ESTs 


A AQ7 


111964 


R41227 


Hs£1860 


ESTs 


Aftfl7 

0.097 


135100 


AA398926 


H&251108 


Homo sapiens mRNA; chromosome 1 specific transcript KIAA0493


A AQ7 


124872 


R692S1 


Hs.101506 


EST 


A AQ7 
0AJ97 


103084 


X59932 


Hs.77793 


c-src tyrosine kinase 


0.097 


124138 


H23199 


Hs.107010 


ESTs 


A ACQ 

0.090 


130048 


R31745 


H&211612 


SEC24 (S. cerevisiae) related gene family; member A


A AOO 


100208 


D26129 


Hs.78224 


ribonuclease; RNase A family; 1 (pancreatic)


Um90 


123537 


AA608775 


H&1 12589 


ESTs 


A AOS 


118999 


N95019 


Hs55092 


ESTs 


noon 


119847 


W80384 


Hs.9853 


ESTs 


A AOO 


112819 


R38618 


Hs-35984 


ESTs 


A AQQ 
U.U90 


131080 


J05008 


Hs.2271 


endothelin 1


A AOO 

U.W30 


127353 


AA190853 


HS.155S60 


ESTs 


A AOQ 
U.U90 


132068 


X66365 


nS.004Cl 




0.098 


105744 


AA293436 


Hs.12909 


ESTs 


0.098 


133680 


M92357 


Hs.101382 


tumor necrosis factor; alpha-induced protein 2


0.098 


122899 


AA4699S0 


Hs.178420 


ESTs; Highly similar to WASP interacting protein [H.sapiens]


0.098 


128700 


U59286 


Hs.103982 


small inducible cytokine subfamily B (Cys-X-Cys); member 11


0.098 


104393 


H46486 


H&226499 


rtesca protein 


0.098 


123320 


AA496792 


Hs.139572 


EST 


0X198 


129169 


N31641 


Hs.109058 


ribosomal protein S6 kinase; 90kD; polypeptide 5 


0.098 


135093 


U51333 


Hs.159237 


hexokinase 3 (white cell)


0.098 


113269 


T65159 


HSJS044 


ESTs 


0.098 


124283 


H86783 


Hs.194136 


ESTs; Moderately similar to zinc finger protein RIN ZF [R.norvegicus]


0.098 


114376 


GMCSF 




Accession not listed in Genbank 


0.099 


100881 


HG4458-HT4727 




Immunoglobulin Heavy Chain, Vdjc Regions (Gb:L23563)


0.099 
116572


D45654 


Hs£5582 


123958 


AA621747 


Hs.1 12847 


100818 


HG401frHT428B 




132754 


W47419 


H&56007 


112741 


R93080 


Ha35035 


112748 


R93289 


Hs.1 66492 


130858 


S57235 


H&246381 


124870 


R69233 


Hs.101504 


125304 


Z39833 


Hs.124940 


121297 


AA401995 


H&97860 


128602 


AA046103 


Hs.102367 


124062 


H00440 


Ks.144524 


100547 


HG2149+IT2219 




105652 


AA282S05 


Hs.19015 


133390 


AA459S45 


Hs.72660 


133503 


M33195 


Hs.743 


109461 


AA232667 


H&58210 


102068 


U09117 


Hs.80776 


113464 


TB6931 


Hs.16295 


104240 


AB002368 


Hs.70500 


121113 


AA399109 


Hs.161813 


122896 


AA469952 


H&97899 


102405 


U43148 


H&159526 


103599 


Z33905 


HsH1218 


121079 


AA393719 


Hs.14169 


115820 


AA427487 


H&39619 


125106 


T95766 


Hs.189760 


131373 


N6B116 


H&26146 


120224 


Z41239 


Hs.106960 


133090 


AA448228 


Hs.6468 


132300 


M133244 


Hs.44234 


113129 


T49384 


H&8988 


110638 


H73197 


Hs.17241 


131364 


R53255 


H&26010 


105370 


AA236476 


H&22791 



DKFZP586C1324 protein 
EST 

Opioid-Binding Cell Adhesion Molecule

Human DNA from chromosome 19-specific cosmid F25955; genomic sequence

ESTs 

ESTs 



ESTs 

GTP-binding protein

ESTs 

ESTs 

ESTs; Weakly similar to signal transducer and activator of 
transcription 2 [M.musculus]
Mucin (Gb:M57417) 
ESTs 

KIAA0585 protein

Fc fragment of IgE; high affinity I; receptor for; gamma polypeptide
ESTs 

phospholipase C; delta 1
ESTs 

KIAA0370 protein
ESTs 

ESTs; Weakly similar to daE; Ien343; CAI: 0.17fALC_YEAST P25335 

ALLANTOICASE [S.cerevisiae]

patched (Drosophila) homolog

receptor-associated protein of the synapse; 43kD

ESTs; Weakly similar to CREB-binding protein [H.sapiens]

ESTs; Weakly similar to RETICULOCALBIN 1 PRECURSOR [H.sapiens]

ESTs 

Down syndrome critical region gene 3 

ESTs 

ESTs 

ESTs 

EST 

ESTs 

ESTs 

ESTs; Weakly similar to transmembrane protein with EGF-like and two
follistatin-like domains 1 [H.sapiens]



O099 
0.099 

ajssa 

0.099 
O099 
OJ099 

om 

0.099 
0X89 
0.099 

osm 

0j099 
0XB9 
O099 
O099 
0X199 
0X89 
0.099 
0.099 
0.099 
0.1 

0.1 

ai 

0.1 

0.1 

0.781 

0.1 

0.1 

0.1 

0.1 

0.1 

0.1 

0.1 

0.1 

0238 
TABLE 11A shows the accession numbers for those primekeys lacking UniGene IDs in
Table 11. For each probeset, we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers



Pkey CAT number
100610 19864J 



100674 21517.2 



108559 41469.9 

100721 19818J 

100748 41861J 

100750 15759.1 



100751 24700.1 



100760 1334.7 
100775 18179.3 



Accession 

AW161357 A1878062 AI928938 AW161097 AW161167 BE314465AA351715 F07036.AA179034 F08510 F00653 AB36671 
AA476718 AW772454 AI807703 R44253 AA976667 AI985186 AK50254 H38942 R84829 AA018724 AA001000 H85934 
AA019126 H85609 AA017000 AA339355 AW950556 D51397 AA213981 BE548002 AI056359 AA001560 AW9521 13 
AA317769 AB57477 AI857475 AW249771 AW162681 H38943 AA018628 R85885 AJ98461 3 AB34765 AI7S6172 AW157488 
AI929191 R85523 D51221 D53851 H85610 AI749674 F21582 AA323145 AA019127 AA687444 T0B745 AI699293 K29532 
AA214029 AA223658 NM.016834 X14474 R19S97 H09895 R17455 R13812 R19056 AI681231 AI590200 R37671 AA861828 
AI990023 AI935669 AW005821 AA324581 H17335 R37659 R42802 R46242 R60936 R59731 H28993 AA479907 R44570 
AI890696 AA308884 AA507078 R41274 AI365507 T16348 A1560453 F03259 F04722 T16312 AA016081 AW073061 
BE314824 W28930 R44098 R51045 

AW403342 AW248986 BE561709 AA357312 BE31 1834 BE389498 BE294887 AW732696 BE047888 AI702383 BB019155 
AI702367 BE408966 BE280458 BE313759 BE513492 BE535404 BE280258 AC005263NM_007165 121990 AW732711 
AI564920 AW249094 BE265385 AW607186 AW607346 BE005217 H27211 U46230 BE260066 BE207043 BE546782 
AW248659 
AA085228AA085161 

L40904 NMJD05O37 XS0563 AB005526 H21598 AA088517 
X06096 X05826 

BE157260 BE157265 R481 18 H43827 Z17877 AW379070 AW291778 M20605 J03253 M14206 V00568 AI860465 AW296022 
M13930 AL047400 J00120 BE018476 AW675223 T26980 F06694 R22709 R24720 H22753 AI903100 AI903094 AW937823 
X00364 D10493 K01904 K0190S K00535 L00058 AA410662 AW384760 AA304930 AI680985 XO0198 H58025 AW998901 
AV653447 N31654 AW610357 AW610369 AW862480 BE223010 AW384172 AW384219 AW384171 AW384218 AA298522 
BE140421 AW945162 AW751711 AA514409 AW747912AK14214 W87741 AA972406 AA554513 BE302087 A1249030 
AA477850 AV653129 AK81360 A1274110 W87861 AA641366 X66258 A1051600 AA877139 AA527483 AA857219 AI250782 
AAS25531 AA807892 AE7881 1 A1224033 H24033 AA593398 AW129709 R45453 N22772 AA235530 T29737 AI016409 
AI688907 AA568370 AA722760 AI539329 AA550843 AW674898 AI538452 A1538453 AI337957 AA477744 AA464600 
AI140319 AW949294 AI339781 AI828736 AA923S34 AA344094 A1278350 AA975567 AA908416 AA857170 AW023520 
R43413 R48004 F02958 A1989439 R11207 AA737307 D10493 AW950652 AI093842 AI474024 AA703369 R1 12S4 M13930 
M13930 M13930 M13930 M13930 J00120 M13930 M13930 X00354 J00120 R19507 AA639812 
N32759 N29730 N30831 N32604 N31955 AI206390 H87574 R23494 AI186215 N30036 AI741512 J00117 NM.000737 
AI453626 AA330974 AI188729 AI188604 AI188964 N30276 AI188947 AI1B8830 AI188303 AI200457 AE19166 AI192459 
AI183280 AI189275 AI188639 AI186353 AI189616 AI184224 AI130720 AI188454 AI188391 AI148857 AI192447AI209155 
AI190013AI208355 AI188721 AI189429AI189364AI18B330AI431595 AI189595AI188781 AI148647AI200022AI221552 
AI220923 AI188728 AA233034 A1189807 AI189641 AI219044 AI148774 AI200658 W71989 AI207360 AI188824 AI200559 
AI200270 AA644163 AI199943 AI151301 AI189555 AI262724 AI148590 AI148695 AI126906 AI149163 K03183 K031B9 
A118SB42 AI221014 N30608 AI18S4SS AI220865 AI188498 AI138226 AI189968 AI221019 AI138197 AI149426 A1148904 
AI186218 AI188348 AI160579 AI198460 AI149039 AI160936 AI219055 AI184784 AI221580 AI161082 AI160814 AI123896 
AI417614 AI126101 AI188872 AI149571 AI168533 AI149072 AI149467 AI131288 N30684 AI160705 AI160692 AI149559 
AE73580 AI189442 AI138448 AI149591 N27302 AA400910 AI138431 AI138435 AI128407 N3Q216 AI128296AI219589 
AI188492 AI149447 AM 68482 H95374 A12190O9N31616 A12762I6 N32233 AI291937 N30741 A1188689N27111 R23214 
AI221605 AI184348 AK00375 H94451 N26397 AI871881 AA232905 N30B33 AI220780 HS4446 N30822 H87464 R68815 
N30290 AI128424 H12587 T47334 H87631 H87156 AI219133 AI868741 AA3308S9 H86993 AA330413 H93656 N30817 
T90191 H93868 AE00054H95207T47316H95381 T49170 R00880T49171 N27381 H94107R63352T85053 AW451B99 
H95142 N30313 K94015 H88987 T28278 N29701 C18834 AA331267 AA330939 AI654493 N27073 N29831 R681 13 N30758 
R26086 N32108 K95135 AA330414 AA330978 AI219422 AI189453 AI199951 X00264 NM_O00894 AA371909 AA0S3498 
T29543 AA371971 AA372026 AA371978 AA371346 AI051683 AI186418 AI220S59 A1189068 AI219266 All 86552 AI188715 
A11491S6 

AW794626M27126 M27014 

J05S81 M61 170 T27692 M34088 M34089 AW860335 AW579047 AW610437 AW610388 AW610422 AW610473 AW579078 
AW604897 AW860163 AW579067AW852410AI816584 AW177757AW602769 AI909790 AW860331 AB09787AI909811 
100800 24735J



100818 19604.3 

100881 458J27 

100885 12707 3 
100898 8542.1 



102459 3556J 
126126 1630017J 
102620 16821_37 
102673 24986_6 
102675 5145_4 
102753 2226.1 
102799 34624 4 
127034 51148.2 

103522 21640.1 



127071 188097.1 

126456 291965.1

119388 1762256.1 

126858 20669 1 



103996 224545.1



113213 23798.1 



134947 844579.1 
129311 16078.1 



AI909813 AW845083 AI905920 AW387919 BE140766 AI909279 AW369405 AA429321 AA429320AA367451 AA847972 
AW001137 AI567905T84561 A1631295AA151351 H02932AB84S19 AA367457AW369421 AI678B48AW391803 AI6108S9 
AW192838 AI922289 AB52140 AI910233 AI479474 AW001395 AA488073 AB85760 AW130017 A1858369 AA627845 
AW081805 AA158865 AK24443 AA344985 AA569793 R72486 AB89329 AI903204 AB69893 AA641284 AI279932 AA149270 
AI697120 AA729146 AI589353 AA480087 A1923310 AA530908 AI275395 AA4250S2 AA580280 AA889527 AA158866 
AW131341 AA573028 AA877326T28335 AW951288 H04235 AA099243AA994659AJ659618 AA887919AI299297 
AW001 1 16 AW283844 AB70578 AA970828 AW572126 AA775299 AW3S9449 AW389398 AW363452 AB33677 AI870710 
AW9291 ) A1582484 AI497674 AA937028 AA88S865 L38597 AA908325 AW369432 AW026623 AA627778 A1264942 
AA932409 AI187328 AK72970 AB86098 AW440471 AW138860 AI866858 AK02528 AI926172 AW243914 AI933690 
AA996114AA536189AW(M9937A1918060AI270379A1973169AW175638AW36M13 

NM.006227 L26232 R50649 AU077024 AL008726 AA411079 R35151 BE278153 BE278139 AI459777 R8B036 Z43210 
F07326 AF052157 R17844 BE615476 TB2160 R71985 H21963 AA299158 AW368248 R43123 R50628 R70441 H27245 
H72015 R72345 R39392 AB09738 BE612778 BE613234 D521 16 D52138 D52132 D52087 D51922 D51995 051905 N34249 
K25459 AA464436 AA297350 AA297466 R81736 KQ2737 AW582505 R27523 AI834241 AW130867 W72668 W76426 
AA358363 R50262 AW473860 H52335 H43953 H21 964 T39505 AI887517 AW156925 AW839850 K02628 AW007705 
A1561008 F22392 R71279 AA995433 R50725 W24462 H71931 AA464437 AW591731 R25667R52695R50810AI560805 
AI089266 H68386 H41353 H28590 AW001860 AI141623 AA250773 Af284778 AW/511412 AW083975 AA130377 AW026047 
R50551 R81494 AI357668 AI078272 F32666 F3S981 AW304865 H43906 AA931068 R48010 A154Q217 AI017339 AI291812 
AI741954 AA458490 AI088378 AA298764 H61 168 AA358362 AA298725 AA298515 AA464148 AA443538 R43046 AA084314 
T40641 T47608 T48940 AI082477 AW470145 N92284 AI758958 AA298512 AA284586 AI597777 AA480277 AI932559 
A1869081 AA476615 AA503651 AI656024 AW168522 AI682051 AI689108 AE74592 AB20917 BE258916 BE615861 
BE28Q282 R53386 BE278255 BE278398 T47607 AA477662 H68385 

100817 19648 1 L34355 U6810 NM.000023 U08895 AA424260 AI097272 AA424162 N79764 F19290 F25278 A1479385 
AA460662 AA432059 AW016935 F25770 F32549 F36677 F33016 F35992 F36010 AW172497 AA835076 F28727AA211643 
AA453282 

U79251 AA843351 R38201 R6S461 R44908 AA683289 H17477 R37364 R52832 AW298336 AA351391 KM.0Q2545 L34774 
AA296886 AW967001 T28889 R13451 T77331 AL1 19196 All 18830 H08459 AW892812 AW905838 H17585 R52878 
BE561958 BE581728 BE397612 BE514391 BE269037 BE514207 BE562381 BE514256 BE514403 BE514250 BE397832 
BE269598 BE559865 BE33SS81 BE560031 BES14199 BE560037 BE560454 
X07881 NM.006249 X07637 AA376715 AA376677 X07715 X07704 S80916 

BE387614R51501 AA199714 AW674779 F08178 BE269071 AA376313H08264AA380420H18785 AL042151 BE277758 

BE267438 NM.OO5850 L35013 BE540833 BE390902 BE391494 BE277459 BE335592 BE390612 BE384263 BE387779 

BE388647 BE537373 BE547158 AW409585 AW374033 AW602185 AA355725 AW577548 AW935015 AW935160 W40232 

AW938647 AW374332 AA434040 BE293488 AL138361 BE560260 AI745075 AA317980 AW949382 AB34311 AB53582 

AI831042 AQ61878 AA618606 AA729052 AI424969 AA199715 AW769374 AI828422 AW044307 AI862816 AI203583 

AW084461 AW514655 AA831883 AA290672 AA831286 AA578510 AW089965 AW150746 AA292743 H22232 AI469275 

AW439312 AA292744 AW471443 AI473989 AA593336 AA464070 AI678937 AW069451 AA970763 AA610480 AA593328 

AA464009 AA768985 AI298928 AA436600 AA464718 AA699361 D61482 DS5935 AI369591 AA470695 AI809135 AA640627 

AI568446 R51502 W45467 AI655316 AA463934 AW168609 AW518663 BE045525 Z41251 A1868091 AA908160 AI026697 

AI886259 AI612932 AA215437 AI956014 BE5410S7 BE255652 BE265878 BE394102 WZ7502 

U48936 L36592 X87160 NMJM1039 ALD36606 AL038420 U35630 AW298574 

W80551 M85370 

AA976427U66052 

AI457548U72509 

U72512 T98357 R31335 F18090 

L32961 NM.O00683 U80226 S75578 AA425061 AA429317 AI815143 AA910669 A1286022 AK86019 
U88895 U88898 AA91 6056 T03285 A1341594 AI359534 AK34031 U88897 

BE397750 AA232171 8E562900 BE384894 BE242228 BE206819 BE261742 AA296468 AW959763 BE276164 BE264109 
BE392626 BE256735 AA301453 N55872 H01676 AA292746 AA427485 AA496400 AA352389 

Y10518 Y10514 Z83935 Y10508 AK000055 Y10519 AI142012 AI681 175 BE222219 AA8S0586 BE504347 BE328064 N63044 

N51228 AI151248 AI521996 AI924777 AW375954 A1860275 W00549 AI742673 AW612288 AI763062 AA632510 A1087347 

AI088070 AI214349 AA890297 AW94156 AI698598 AA631658 AA504593 AA860733 AI266761 AW663214 AW771231 

AA639610 AI769806 AI769746 AW014326 AJ28861 1 

AA250806AA459220 

AA429212W00881 

TB8798 R92430 

AI084125 AW83773 AI479687 AI939609 AI968662 AF129507 NM.013282 AW971840 AW298508 AA744240 AA811217 
AA827671 AA81 1055 AA806567 AA488977 AA908902 AI637637 AA927056 AI870139 AW340492 AA488755 AA129794 
AA306523 AA354253 BE256277 AC053467 AWS62084 

AA321355 AW964592 R23284 H73883 R23382 N47914 C01377 H04668 AW606248 R34447 AA847136 AI684489 AI523112 
AW044269 AI379138 N29366 AA761543 N79248 AAS60845 AA768316 AI147926 AI718599 AI880620 R67467 AI216016 
AI738663 H04648 

NM.001395 Y08302 AI434619 AM70328 AE61807 AW024965 AI806537 A1830549 A1640337 AI219065 AW271700 
AW028488 A1133339 AI859205 R5117S U87167 BE379324 BE392008 AA340819 AA3431 10 T57275 D59164 AVK99312 
AM34422 AI93S390 AW024975 R40262 

AW269126 R09430 T55S90 AI367247 A1253132 BE464248 T58658 AW207785 T58607 
R51 194 AI732276 R53587 AI820697 

AK000526 BE550084 W30689 AW271859 AA41 1456 AB41551 AA242990 AA243027 H87048 D20360 AI184053 AA146956 
AI721023 AI718944 AA146955 F18215 AA903890 AI700355 AI075430 AA411584 AA878210 AI476760 AW945637 AA630596 
AA431522 AA301989 AB09058 012149 N41960 BE222214 AA6C9922 AA82B176 AA393359 AA338693 AW024956
BE467805 AW298623 AW264085 AHJ24454 A1Q24719 AM31927 T35Q87 A161 1014T54920 AA131253 AI436344 

114427 9724_2 AAQ17176 AI359979 AA047836 AA017063 AA016303 AA001545 

114569 110077.1 AA063315 AA033316 

100108 15621_-5 ARMS910 

100515 342_1 AA305746 D90187 T63943 AW951 154 T29182 AI734941 D13264 A1299239 Z18812 AW299859 W24476 AA933064 

AA489759 

100531 46038J AWB88554 AW607282 AA319986 M28590 

100545 22955J1 M55405AW752552 

100574 17320.2 AA325895 M10036 NM.000365 N84665 K69414 N84S57 AA380453 AA329743 AA357367 Ml 88770 AA376532 AA353653 

AA158953 AA083176 BES37313 AA181433 D53373 R57376 AA206698 R14807 HI 8899 H11191 H93892 R25593TB1134 
N93285 AA083081 AA831789H13137AA497014AA079330AA182861 H13138W47161 R62913AA687088AA211112 
AA429237 AUB5923 AA100070 AW392898 A1566433 AA866006 AA214002 AW39286S N79454 AA197181 AB80371 
AA176501 AA737967 AI089225 F34874 AW571437 AIS20820 AA573489 AA423816 AA164917 AA458455 T47072 AI569087 
AI281656 AA730919 AI633441 AW195182 AI351622 AW24346S AI872649 AB59227 AA987941 A693770 T47073 AW779948 
AW510580 AI635626 AW627601 AA864328 AA953578 AB41418 BE222853 AE41963 AI094663 AA928380 AA493373 
AW043762 AB77783 AW958987 BE819760 AA385240 BE277975 BE280095 AWB31443 AA581048 BE818715 BE299610 
C14874 BE559858 BE378455 BE618290 BE544585 AI525575 BE548897 BE2671 10 AA804738 BE269821 AA918133 
BE277647 AA599947 BE280735 BE390239 N74150T12504 AI208197 AW955527 AA113897 N40081 H73835 H70393 
AM34041 W22950 AH92661 BE264461 W26486 AA628424 AA196894 T69209 AA857976 AI540287 AA410599 AAS64287 
AW950564 AA013320 T49283 AI541438 AW804703 AA335534 AA335659 BE5622S9 BE618802 BE277850 BE546413 
BE280994 AA204813 BE561694 BB43524 BE253S47 AW001452 W191 16 BE542508 AA205894 BE25487S BE270033 
AI52S906 BE251792 AA975700 BE272138 AW607671 N87688 MI0036 BE515060 BE298607 AI745178 U47924 H03193 

100627 tigr_HT2798 Z25424

100756 tigr_HT3768 M88357

100768 tigr_HT3846 129141 M69180 M81105

100813 tigr_HT4265 L33999

100836 tigr_HT4383 U04688

100855 tigr_HT4504 U09806

102104 entrez_U12139 U12139

125091 genbank_T91518 T91518

100929 tigr_HT688 X65561

125147 entrez_W38150 W38150

102354 entrez_U38268 U38268

102491 entrez_U51010 U51010

102638 entrez_U67092 U67092

118769 genbank_N74496 N74498

101048 entrez_K01160 K01160

101057 entrez_K03430 K03430

108334 genbank_AA070473 AA070473

108417 483241J AA070853 AA075749 AA075716

108441 genbank_AA079079 AA079079

108786 genbank_AA128999 AA128999

101655 entrez_M60299 M60299

101697 entrez_M54358 M64358

117437 genbank_N27645 N27645

101798 entrez_M85220 M85220

101809 entrez_S69265 S69265

103508 entrez_Y10141 Y10141

103575 entrez_Z26256 Z26256

119332 genbank_T54095 T54095

112161 genbank_R48295 R48295

119564 NOT_FOUND_entrez_W38206 W38206

114376 NOT_FOUND_entrez_GMCSF GMCSF

100478 tigr_HT1067 M22406

100547 tigr_HT2219 M57417

100554 tigr_HT2324 Z11585
TABLE 12 shows genes, including expression sequence tags, that are down-regulated in
prostate tumor tissue compared to normal prostate tissue as analyzed using Affymetrix/Eos
Hu01 GeneChip array. Shown are the ratios of "average" normal prostate to "average"
prostate cancer tissues.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Background subtracted normal prostate : prostate tumor tissue
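
The R1 column defined above can be illustrated with a minimal sketch. The text states only that R1 is the ratio of "average" background-subtracted normal prostate signal to "average" prostate tumor signal; the use of an arithmetic mean, a single scalar background value, and a lower clamp (`floor`) to avoid division by zero are assumptions made here for illustration, and the function and parameter names are hypothetical.

```python
from statistics import mean


def r1_ratio(normal, tumor, background=0.0, floor=1.0):
    """Ratio of average normal signal to average tumor signal.

    Assumptions (not stated in the source): arithmetic mean, one scalar
    background value for all samples, and a clamp at `floor` so that a
    tumor signal at or below background does not divide by zero.
    """
    # Average the background-subtracted intensities of each tissue group.
    normal_avg = max(mean(x - background for x in normal), floor)
    tumor_avg = max(mean(x - background for x in tumor), floor)
    # R1 > 1 means the probeset is lower in tumor than in normal tissue.
    return normal_avg / tumor_avg


if __name__ == "__main__":
    # Hypothetical intensities for one probeset, not data from Table 12.
    print(r1_ratio([120, 140], [30, 50], background=20))
```

Under these assumptions a probeset with R1 well above 1, as in the rows of Table 12, is one whose signal is lower in tumor tissue than in normal tissue.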



Pkey ExAccn UnigeneID Unigene Title R1


100522 


HG1763-HT1780 Prolactin-Induced Protein


174 


130803 


IWOID3W 


Uq 1Q£fl cpmpnrtnpfln 1 
ruiaiooo ooiiiojiu^fBuii ■ 


16.785 


118068 


IM03U43 


He 14741 PCTc 
rlS.I3Y<tJ Colo 


13225 


114251 


70QOQO 




12.7 


112134 




He 7Ai1 CCTe 
f1S./410 CO IS 


'8.735 


1014.% 


H1£UQ4£ 


Hs 1189Q5 Human alkali myosin light chain 3 mRNA; complete cds


8.175 


104028 


AA3D1UU4 


He 991 19ft FQTc 


8.15 


IUO.PHt 




He 17177m pCTc* Hlrrhh/ chnflar tn nmurth orroct trvfiirfhlo nana nrrvfiirt rH-Sanfanal 


7535 


103838 




Hc19fi99 CCTe 
rlS>l&0££ CO 19 


7212 


120469 




He 95009 nKF7P58fiM 1 R?4 nrnta'm 


7.175 


110279 




He 974RA FQTq 
n&£fOO*r _COI9 


6.701 


127472 


/VVQlOfO 


He 199013 FSTs 
no. 1 •3£-\j 1 0 co 1 a 


6.642 


133301 




Hs.7037 pallid (mouse) homolog; pallidin 


6.411 


102457 


ii^Ann7 


Hs.2359 dual specificity phosphatase 4


6595 


114011 


VVWJCm 


Hs.15082 ESTs 


6.15 


101249 




Hs.1904 protein kinase C; iota


6 


123265 AA491209 Hs.105265 ESTs; Weakly similar to reverse transcriptase [M.musculus] 6

119322 T49655 Hs.241569 ESTs; Modly smlr to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 5.95

101673 M61906 Hs.6241 phosphoinositide-3-kinase; regulatory subunit polypeptide 1 (p85 alpha) 5525

115586 AA399218 Hs.92423 ESTs 5.7

120590 AA281780 Hs.111441 ESTs; Weakly similar to similar to Kruppel-like zinc finger protein [C.elegans] 5.7

109748 F10192 Hs.248323 Tubulin; alpha; brain-specific 6.625

134727 X80507 Hs.8939 yes-associated protein 65 kDa 5.5

129171 AA234048 Hs.7753 calumenin 5.486

120390 AA233122 Hs.111460 ESTs; Highly similar to multifunctional calcium/calmodulin-dependent protein kinase II delta2 isoform [H.sapiens] 5.4

131699 R68657 Hs.30421 ESTs; Modly smlr to !!!! ALU SUBFAMILY SX WARNING ENTRY !!!! [H.sapiens] 5.279

104490 N71503 Hs.43087 ESTs; Weakly similar to dysferlin [H.sapiens] 5.266

102124 U14528 Hs.29981 solute carrier family 26 (sulfate transporter); member 2 5.151

109280 AA196635 Hs.56081 ESTs 5.134

109707 F09739 Hs.185701 Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 21920 5.075

108087 AA045709 Hs.40545 ESTs 5.075

135006 M21665 Hs.929 myosin; heavy polypeptide 7; cardiac muscle; beta 5.055


119182 R80664 Hs.77067 ESTs 5.033

129806 R62444 Hs.173373 KIAA0931 protein 4.675

101435 M20543 Hs.1288 actin; alpha 1; skeletal muscle 4.626

125954 R93943 y(72c12j1 Soares retina N2b4HR Homo sapiens cDNA clone IMAGE275735 5', 4.6

113989 W87544 Hs.221184 ESTs 4.559

104432 J03460 Hs.99949 prolactin-induced protein 4.451

112326 R56068 Hs.4268 ESTs 4.45

119063 R16833 Hs£3106 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 4.45

130376 R40873 Hs.155174 KIAA0432 gene product 4.201

122484 AA448286 Hs.98074 ESTs; Highly similar to atrophin-1 interacting protein 4 [H.sapiens] 4.2

104142 AA447006 ESTs; Moderately similar to !!!! ALU SUBFAMILY SQ WARNING 4.175

129413 N32787 Hs.11123 ESTs; Moderately similar to hypothetical protein 2 [H.sapiens] 4.1

103678 Z84483 Human DNA sequence from PAC 46H23, BRCA2 gene region chromosome 13q12-13 4.05

114266 Z40186 Hs.26409 ESTs 4.05

115206 AA262491 Hs.186572 ESTs 4.048

123723 AA609749 Hs.112759 ESTs; Highly similar to unknown protein [R.norvegicus] 4.041

129130 K97993 Hs.172788 ESTs; Weakly similar to KIAA0512 protein [H.sapiens] 4.028
20217
08536 
34460 
20418 
32783 
25052 
08600 
03099 
34948 
20511 
11861 
13966 
31649 
29775 
10191 
12678 
127115 
32892 
15023 
14932 



H&66035 ESTs 



34480 
24780 
30631 
34154 
04160 

05524 
10168 
09480 
09585 
15134 
16083 
20524 
16932 
30746 
07513 
18641 
126584 
05134 
23502 



05691 
31505 
20775 
05579 
28190 
00819 
30217 
30068 
34719 
10277 
27354 
29173 
27464 
24923 
22465 
122027 
03329 
29937 
34197 
07764 
21775 
14768 
32381 
23105 
21176 
25053 
05909 



Z41078 
AA084S24 

AA4O0030 Hs-8380 

AA236010 H&26613 

N74897 H&5683 

TB0174 H&222779 

AA099585 Hs.41175 

X61100 H&8248 

H06773 Hs53S50 

AA258144 H&221576 

R37460 H&25231 

W86600 H&S842 

AA481254 H&30120 

R94659 Hs.12420 

H2056B H&27182 

R87160 H&33665 

AA375791 Hs.131894 

W92797 H&59378 

AA252079 K&63931 

AA242751 Hs.16218 

AA487228 Hs.19479 

AA024664 H&83916 

R42493 H5220839 

AA025399 Hs.169737 

AA211320 Hs.79404 

AA455706 Hs.99722 

AA258158 H&22153 

H19673 Hs.176586 

AA233299 Hs.72158 

F02367 Hs27252 

AA257107 Hs.194331 

AA455653 Hs.44581 

AA261852 Hs.192905 

H74330 Hs.150000 

AA256976 Hs.18800 

X05451 Hs.158295 

N70298 Hs.49829 

AI028384 Hs.127331 

AA159953 H&22895 

AA600116 Hs.112526 

N50866 Hs.47135 

AA287097 Hs.75356 

H85897 H&27755 

AA342104 Hs56777 

AA278824 Hs.19218 

AA946876 Hs.148376 
HG402OHT4290 

D29956 Hs.152818 



zn19d8 .si Stratagene neuroepithelium NT2RAMI 937234 Homo sapiens cDNA
ESTs; Weakly similar to !!!! ALU CLASS B WARNING ENTRY !!!! [H.sapiens]
Homo sapiens mRNA; cDNA DKFZp586F1323 (from clone DKFZp586F1323)
DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 15



4j028 
4.023 
3525 
351 



ESTs 

NADH dehydrogenase (ubiquinone) Fe-S protein 1 (75kD) (NADH-coenzyme

protein kinase; AMP-activated; gamma 2 non-catalytic subunit

ESTs

ESTs

ESTs

ESTs

ESTs

phospholipase A2-activating protein

ESTs

ESTs

DKFZP434Q162 protein
dachshund (Drosophila) homolog
KIAA0903 protein
ESTs

NADH dehydrogenase (ubiquinone) 1 alpha subcomplex; 5 (13kD; B13)

ESTs 

ESTs 

neuron-specific protein 

ESTs; Weakly similar to 78 KD GLUCOSE REGULATED PROTEIN 
PRECURSOR 

ESTs; Weakly similar to KIAA0352 [H.sapiens]

ESTs

ESTs

ESTs

ESTs

ESTs; Weakly similar to HEAT SHOCK 70 KD PROTEIN 6 [H.sapiens]

ESTs

ESTs

ESTs; Weakly similar to KIAA0579 protein [H.sapiens]
Human alkali myosin light chain 3 mRNA; complete cds
ESTs
ESTs

ESTs; Weakly similar to arylsulfatase B precursor [H.sapiens]

ESTs 

ESTs 



ESTs 
EST 
ESTs 
ESTs 
Transglutaminase 



AA608903 

L07515 

H29209 

AM188B0 

R60523 

AA970504 

R94500 

AA448164 

AA431302 

X85134 

M95767 

AA057341 

AA018219 

AA421773 

AA149007 

N48818 

AA485973 

AA400080 

T80620 

AA401739 



Hs.106220 

H&89232 

Hs.151231 

Hs.185797 

Hs.109087 

Hs.146103 

Hs.108046 

Hs.99153 

Ha98721 

Hs.72984 

Hs.135578 

H&87889 

H&226923 

Hs.161008 

Hs.182339 

Hs.46884 

Hs.143947 

Hs.97774 

Hs.186473 

Hs5111 



ass 

1833 

3518 

3.792 

1779 

3.768 

3.75 

3.708 

3.707 

a7 

a7 

a674 
a653 
a625 

a&2 

3.614 

a6i3 
a6 

3592 
3568 

1559 
3542 
3525 
3522 
35 
35 
3.459 
3.45 
3425 
3.42 
3417 
3.407 
3.399 
3.325 
3518 
3317 
3515 
3509 
35 
3295 
3292 
3288 
3273 
3269 
3266 



KIAA0336 gene product 
chromobox homolog 5 (Drosophila HP1 alpha)
ESTs; Highly similar to FYVE finger-containing phosphoinositide kinase [M.musculus] 326
ESTs 3212



ESTs
ESTs
ESTs

ESTs; Highly similar to CGI-73 protein [H.sapiens]

EST; Weakly similar to N-copine [H.sapiens]

retinoblastoma-binding protein 5

chitobiase; di-N-acetyl-

helicase-moi

ESTs

ESTs

Ets homologous factor

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 



ai97 
ai79 
ai75 
3.151 

aisi 

3.15 

3.15 

115 

3.125 

3.125 

ai2 

an 

aio4 

ai 

ao75 

3.066 
119767 W72562

115776 AA424038 

111713 R22988 

115301 AA280047 
118448 N66412

106586 AA4S6598

110415 H48239

105173 AA182030

101102 IX7594
110543 H58383

1255S3 R24464

100824 HQ4058-HT4328

106822 AA481068 HsX1835

131983 D11930
111221 N68869

113620 T83785

105220 AA210695

123234 AA490227

125250 W87465
116196 AA465160

122100 AA432243

111712 R22905

126589 W78107

111132 N64378
115307 AA280300

108989 AA152263

129486 H03686

119805 W73788

125721 R598S1
103704 AA028171

128420 AI088155

120571 AA280738

123059 M482019

129462 D84239
125166 W45491

125992 WD1626

109431 AA227972

105077 AA142919

131388 R34531
121080 AA39B720

112575 R73816

130244 R26206

134698 AA427783

116355 AA504356
115316 AA280627

129677 U48736

130971 H20332

115054 AA252863

130285 AA063546
124308 H93575

125502 AA732329

114800 AA159825

128625 AA242816

130159 H51098
107127 AAS20504

113547 T90746

104639 AA004622

127609 AA622559

106922 AA4909S4
124825 R52088



124333 H98683 

117634 N36421 

101609 M54927 

117142 H96908 

112602 R79147 

106328 M481505 

124377 N25996 



Hs.58119 ESTs
Hs.58197 ESTs
Hs.220950 ESTs
HSA3948 ESTs
Hs.49189 ESTs
HsX56269 ESTs

Hs.29739 ESTs; Weakly similar to RAS-RELATED PROTEIN RAB-3A [H.sapiens]
HsX364 ESTs

Hs.79059 transforming growth factor, beta receptor III (betaglycan; 300kD)
Hs.258544 ESTs
Hs.202949 KIAA1102 protein

Oncogene Aml1-Evi-1, Fusion Activated
ESTs

HsX592 ESTs
Hs.15119 ESTs
Hs.17252 EST
Hs.17212 ESTs
Hs.105252 ESTs

Hs.222926 ESTs; Weakly similar to D20922 [C.elegans]
HsX33B6 ESTs

Hs.41086 ESTs; Weakly similar to OXYSTEROL-BINDING PROTEIN [H.sapiens]
Hs.113716 ESTs

Hs.187698 ESTs; Weakly similar to Yer140wp [S.cerevisiae]
Hs.13149 ESTs; Highly similar to unknown function [H.sapiens]
Hs.191346 ESTs
Hs.18827 KIAA0849 protein

Hs.220689 Ras-GTPase-activating protein SH3-domain-binding protein
Hs.43213 ESTs
Hs.7503 ESTs
Hs.153688 ESTs

Hs.14146 ESTs; Weakly similar to unknown [H.sapiens]
Hs.128679 ESTs
HsX38202 EST

Hs.111732 IgG Fc binding protein
Hs.172609 nucleobindin 1

za36e07.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone
Hs.43635 ESTs
Hs£558 ESTs

HSX2200 KIAA0480 gene product
Hs.177953 ESTs
Hs.17385 ESTs
Hs.153293 KIAA0701 protein

Hs.77910 3-hydroxy-3-methylglutaryl-Coenzyme A synthase 1 (soluble)
HsX8650 ESTs
HSX7846 ESTs

Hs.198891 serine/threonine-protein kinase PRP4 homolog
Hs.28707 signal sequence receptor, gamma (translocon-associated protein gamma)
HSX7729 ESTs
Hs.202968 ESTs

Hs£27146 Homo sapiens mRNA; cDNA DKFZp564J142 (from clone DKFZp564J142)
Hs.191959 ESTs

Hs.131887 ESTs; Weakly similar to ORF YNL227c [S.cerevisiae]
Hs.102652 ESTs; Weakly similar to KIAA0437 [H.sapiens]
Hs.151310 PDZ domain protein (Drosophila inaD-like)
Hs.22119 ESTs
Hs.15233 ESTs
Hs.18214 ESTs
Hs.150318 ESTs
Hs.10056 ESTs

yg85c3.s1 Soares infant brain 1NIB Homo sapiens cDNA clone
Hs.154054 ESTs

Hs.107854 ESTs; Weakly similar to SODIUM- AND CHLORIDE-DEPENDENT GLYCINE
TRANSP

Hs.1787 proteolipid protein 1 (Pelizaeus-Merzbacher disease; spastic paraplegia 2;
Hs.42251 ESTs
Hs.203365 ESTs
Hs.13797 ESTs 
Hs.179833 ESTs 
101026 J04970 carboxypeptidase M 2575

124560 N66393 Hs.102754 ESTs 2575

124066 H02494 Hs.101615 ESTs 2571

130281 R12777 Hs.15395 ESTs; Weakly similar to ARGINYL-TRNA SYNTHETASE [H.sapiens] 2.66

110349 N49602 Hs.13308 ESTs 255

111031 N54839 Hs.221085 ESTs; Highly similar to mediator [H.sapiens] 2533

121770 AA421714 Hs.11469 KIAA0898 protein 2.63

134132 U32519 Hs.220689 Ras-GTPase-activating protein SH3-domain-binding protein 2526

112424 R62452 Hs.191265 ESTs 2525

122544 AA451679 Hs.194410 ESTs 2525

134425 X90568 Hs.172004 titin 2.624

111114 N63391 Hs.5238 ESTs 2.619

116119 AA459242 Hs.44445 ESTs; Weakly similar to Kelch motif containing protein [H.sapiens] 2515

112079 R44164 Hs.23014 ESTs 25

123033 AA481271 Hs.193945 ESTs 2591

124196 H52617 Hs.144167 ESTs 2586

125873 H14437 yl25a04.r1 Soares breast 3NbHBst Homo sapiens cDNA clone 258

117684 N40184 Hs.45050 ESTs 2575

134938 D30037 Hs.168326 phosphatidylinositol transfer protein; beta 2575

131822 AA215647 Hs.200332 ESTs 2568

135185 U71203 Hs.56038 Ric (Drosophila)-like; expressed in many tissues 2564

117690 N40467 Hs.93834 ESTs 2557

118807 N78582 Hs.50732 protein kinase; AMP-activated; beta 2 non-catalytic subunit 2552

121369 AA405657 Hs.128791 Human DNA sequence from clone 967N21 on chromosome 20p125-13. Contains 255

114860 AA235112 Hs.106227 ESTs; Moderately similar to similar to murine RNA-binding protein [H.sapiens] 2549

121857 AA426017 Hs.62694 ESTs; Highly similar to DNA-REPAIR PROTEIN COMPLEMENTING 2548

110190 H20560 Hs.244624 ESTs 2548

132573 AA045333 Hs.51743 ESTs; Weakly similar to !!!! ALU SUBFAMILY SB2 WARNING ENTRY !!!! [H.sapiens] 2542

109706 F09729 Hs.12780 ESTs 2537

135109 AA4103S1 Hs.94592 klotho 2525

132810 R37027 Hs.5737 KIAA0475 gene product 2525

124879 R73588 Hs.101533 ESTs 2525

103840 AA174190 Hs.50932 ESTs 2525

119066 R22196 Hs.34492 ESTs 2519

114833 AA234362 Hs.57310 ESTs; Moderately similar to CGI-66 protein [H.sapiens] 2507

112993 T23555 Hs.103288 ESTs 25

123312 AA498258 Hs.99601 ESTs 2499

121873 AA426270 Hs.145696 splicing factor (CC1 5) 2.491

123321 AA496884 Hs.23972 ESTs 2491

107760 AA018042 Hs.95078 EST 2483

102580 U60808 Hs.152981 CDP-diacylglycerol synthase (phosphatidate cytidylyltransferase) 1 2481

103053 X56741 Hs.5947 mel transforming oncogene (derived from cell line NK14)- RAB8 homolog 2475

124756 R38100 Hs.106294 ESTs 2475

112936 T15665 Hs.5185 ESTs; Weakly similar to BCDNA.GH12174 [D.melanogaster] 2475

125178 W58202 Hs.125731 ESTs 2.475

112423 R62447 Hs.22123 ESTs 2.471

123515 AA600323 Hs.112535 EST 2462

102842 U95020 Hs.21903 calcium channel; voltage-dependent; beta 4 subunit 2457

102400 U42390 Hs.171957 triple functional domain (PTPRF interacting) 2455

113187 T56056 Hs.9992 ESTs 2452

131687 L11066 Hs.5069 heat shock 70kD protein 9B (mortalin-2) 2.448

115314 AA280583 Hs.256501 ESTs 2437

128211 AI206427 Hs.166707 ESTs; Highly similar to Ran-binding protein 2 [H.sapiens] 243

134281 L11005 Hs.81047 aldehyde oxidase 1 2425

115985 AA447709 Hs.132094 ESTs; Moderately similar to putative transcription factor CA150 [H.sapiens] 2.425

111348 N90041 Hs.9585 ESTs 2.418

129430 AA258842 Hs.197877 Homo sapiens clone 23777 putative transmembrane GTPase mRNA; partial cds 2418

133863 C13990 Hs.76930 synuclein; alpha (non A4 component of amyloid precursor) 2417

111164 N66857 Hs.14808 ESTs; Weakly similar to !!!! ALU CLASS C WARNING ENTRY !!!! [H.sapiens] 2.416

132143 AA257056 Hs.7972 KIAA0871 protein 2.412

130330 M55047 Hs.154679 synaptotagmin 1 2408

114219 Z39451 Hs.27389 ESTs 2406

117101 H94043 Hs.24341 DKFZP586I1419 protein 2403

125433 AA034325 Hs.54320 ESTs 2.4

111099 N62506 Hs.21958 ESTs 2.4

120323 AA195405 Hs.110347 Homo sapiens mRNA for alpha integrin binding protein 80; partial 2597

118624 N69998 Hs.21801 ESTs 2594

123570 AA608955 Hs.109653 ESTs 2589

123562 AA608893 Hs.190065 ESTs 2588
131546 AA262821 Hs.28578 muscleblind (Drosophila)-like 2.385

103143 X66141 Hs.75535 myosin; fight polypeptide 2; regulatory; cardiac; stow 2584 

123645 AA609310 Hs.188691 ESTs 2583 

130123 AA001635 Hs.150390 zinc finger protein 262 2579 

131682 AA428368 K&30654 ESTs 2578 

115909 AA436666 H&59761 ESTs 2575 

125168 W45574 H&252497 ESTs 2572 

123973 C14805 Hs.182151 ESTs 2561 

135197 U76456 Homo sapiens tissue Inhibitor of metaBoprotelnase 4 mRNA, complete cds 2557 

118689 N71545 Hs.184544 ESTs 2557 

107734 AA016225 Hs£33B6 ESTs £354 

124590 N69220 H&41381 ESTs; Weakly similar to ublqufiin rr/drolyztng enzyme I [H.sap!8ns] 255 

111163 N66850 Hs.17606 ESTs 2548 

112349 R58877 Hs22665 ESTs; Moderately similar to dJ83L6.1 fRsapiens] 2545 

129076 AA262179 Ks.169343 ESTs 2545 

134238 R81509 Hs.184571 spiking factor; argirdna/serins-rich 1 1 2541 

116766 H13260 HSJ95097 ESTs 2536 

106331 AA436853 HS54795 ESTs 2533 

129003 AA443752 Hs.10784 ESTs 2532 

132368 AA599814 Hs.46637 ESTs; Weakly similar to cDNA EST yk289g55 comes from this gene [C.etegans] 2532 

124697 R06273 Hs.186467 ESTs; Modly smlr to II ALU SUBFAMILY J WARNING ENTRY I! [Rsaplens] 2522 

120273 AA176688 H&221139 ESTs . 2513 

127110 AA304993 Hs.100861 ESTs; WeaHy similar to p60 katanln [H^apiens] ~ 2507 

105450 AA2S2621 HS53B42 ESTs 2501 

119819 W74371 Hs58383 ESTs 2297 

102302 U33052 Hs.69171 protein kinase Ofto 2 2288 

130598 N74353 Hs.16475 ESTs 2282 

114161 Z3B904 Hs22385 ESTs; Weakly similar to KIAA097Q protein [Rsapiensj 2278 

130542 U64675 Human sperm merrfcrane protein BS-63 mRNA, complete cds 2277 

104491 N71513 HS59328 ESTs 2275 

116988 H82527 ys69e12.s1 Scares retina N2b4HR Homo sapiens cDNA clone 2275 

126823 AA370120 Hs.7870 ESTs; Weakly similar to Ytr350wp [S.cerevlsiae] 2273 

108800 AA129731 Hs.90424 ESTs 2273 

101310 L41607 HSJ934 fltucosamlnyl (N-acetyl) transferase 2; Kbranch'mg enzyme 2269 

126842 W19498 HS21085 ESTs 2255 

127251 AA936428 Hs.128638 ESTs 2251 

124647 N91947 Hs.125033 ESTs 2249 

127112 AI143906 Hs.125103 ESTs 2247 

101973 S82597 Hs.80120 UDP4^tyI-alpha-D^alactosamirte^lypepfkle 2246 

120999 AA398302 Hs.127437 ESTs 2245 

130225 AA599583 Hs.15299 HMBA-indudble 2243 

119980 W88678 Hs249247 heterogeneous nudaar protein similar to rat helix destabilizing protein 2243 

124222 H61053 Hs222844 ESTs 224 

129199 H90914 Hs.128629 ESTs 2236 

106802 AA479101 Hs.16570 ESTs; Weakly similar to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 2231

126160 N90960 Hs.247277 ESTs; Weakly similar to transformation-related protein [H.sapiens] 2229

104627 AA001976 Hs.19603 ESTs 2228 

106474 AA450212 Hs.42484 Homo sapiens mRNA; cDNA DKFZp564C053 (from clone DKFZp564C053) 2226

113096 T40927 H&8345 ESTs 2225 

135336 AA452822 Hs.99027 ESTs 2225 

135344 R62976 Hs.168491 ESTs; Moderately similar to TRF1-interacting ankyrin-related 2225

126156 AA508354 Hs.118448 ESTs; Moderately similar to AKT3 protein kinase [H.sapiens] 2222

128885 AA397841 Hs.180141 cofilin 2 (muscle) 2218

107900 AA026385 Hs.176600 ESTs; Moderately similar to !! ALU SUBFAMILY SB2 WARNING 2217

114481 AA033562 Hs.151572 ESTs 2212

109292 AA199828 Hs.188662 ESTs 2212 

104257 AF006265 Hs.9222 estrogen receptor-binding fragment-associated gene 9 2209 

132932 T15482 Hs.6093 ESTs 2204 

127392 AA262728 Hs.14896 Homo sapiens clone 24590 mRNA sequence 2204

104641 AA004652 Hs.18564 ESTs 22 

122529 AA449828 Hs.99229 ESTs 2.195 

124307 H93562 Hs.162395 proline synthetase co-transcribed (bacterial homolog) 2.193 

133601 S95938 Hs.75155 transferrin 2.193 

119904 W85709 Hs.128927 ESTs; Weakly similar to !! ALU SUBFAMILY SP WARNING ENTRY !! [H.sapiens] 2.192

100343 D64109 Hs.4994 transducer of ERBB2; 2 (TOB2) 2.185

126871 AA351779 Hs200334 ESTs 2.18 

127793 AE98835 Hs.30445 ESTs; Weakly similar to transcription regulator Staf-50 [H.sapiens] 2.178

105149 AA169253 Hs.8958 ESTs 2.177
121367 AA405648 zw39g3.s1 Soares_total_fetus_Nb2HF8_9w Homo sapiens cDNA clone IMAGE:772478 2.177
111836 R36228 Hs.25119 ESTs 2.175

133334 R16759 Hs.237225 ribosomal protein S5 pseudogene 1 2.175

123207 AA489697 Hs.145053 ESTs 2.175 

129801 F11087 Hs.239668 ESTs 2.175

103393 X94612 Hs.41749 protein kinase; cGMP-dependent; type II 2.161

132415 AA043223 Hs.4815 nudix (nucleoside diphosphate linked moiety X)-type motif 3 2.157

106369 AA443828 HS25324 ESTs 2.157 

122963 AA478446 Hsj69559 KIAA1096 protein 2.156 

133473 M19309 Hs.73980 troponin T1; skeletal; slow 2.155

134257 C06270 HsH078 Homo sapiens mRNA; cDNA DKFZp586L081 (from clone DKFZp586L081) 2.155

135156 AA056012 Hs.9552 binder of Arl Two 2.151

104055 AA393755 Hs.117211 ESTs; Highly similar to C6M2 protein [H.sapiens] 2.15

102313 U33921 HSU33921 Clontech adult lung cDNA library (HL1158a) Homo sapiens cDNA 2.15

109788 F10638 Hs.12432 Homo sapiens clone 24407 mRNA sequence 2.15

103507 Y10032 Hs.159840 serum/glucocorticoid regulated kinase 2.15

116000 AA448710 Hs.41327 ESTs 2.15 

105858 AA399164 Hs.227676 ESTs; Moderately similar to !! ALU SUBFAMILY SQ 2.137

103153 X66534 Hs.75285 guanylate cyclase 1; soluble; alpha 3 2.137

126202 AA652238 Hs.189726 ESTs 2.135 

115955 AA446121 Hs.44198 Homo sapiens BAC clone RG054D04 from 7q31 2.134

104164 AA458770 Hs.27023 KIAA0917 protein 2.132

108692 AA121270 Hs.82960 ESTs 2.128

122878 AA465341 HSS9640 ESTs 2.126

134771 L13939 Hs.89576 adaptor-related protein complex 1; beta 1 subunit 2.125

104298 D31120 Hsj40368 adaptor-related protein complex 1; sigma 2 subunit 2.125

104840 AA039595 Hs.42458 Homo sapiens mRNA; cDNA DKFZp586C1817 (from clone DKFZp586C1817) 2.125

122180 AA435798 Hs.98835 ESTs; Moderately similar to putative ring zinc finger protein 2.125

131012 H01992 Hs.202849 KIAA1102 protein 2.125

134092 H17490 Hs.7905 ESTs; Highly similar to sorting nexin 9 [H.sapiens] 2.123

118617 N69666 Hs.183413 ESTs; Modtly smlr to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 2.123

107155 AA621202 Hs.7946 DKFZP586D1519 protein 2.12

130925 N71935 Hs.169378 multiple PDZ domain protein 2.12

135167 U63717 HsS5B21 osteoclast stimulating factor 1 2.118 

105952 AA405263 Hs.181400 ESTs 2.109 

110308 H38148 HsJ2775 ESTs 2.108

116368 AA521186 HSS4217 ESTs 2.107 

132939 U76189 Hs.61152 exostoses (multiple)-like 2 2.102

117881 N50073 HsH4926 ESTs; Highly similar to EWND1 protein [M.musculus] 2.1

121723 AA419622 Hs.104800 ESTs; Weakly similar to Mouse 195 mRNA; complete cds [M.musculus] 2.096

103500 Y09443 Hs.22580 alkylglycerone phosphate synthase 2.094

121429 AA406293 Hs.193498 ESTs 2.093 

134632 AA398710 Hs.174139 chloride channel 3 2.091 

129785 F10980 Hs.184760 ESTs 2.09 

111065 N58193 Hs.18740 ESTs; Weakly similar to 1-evidence 2.089

114710 AA129931 Hs.79081 protein phosphatase 1; catalytic subunit; gamma isoform 2.083

132711 N73702 Hs.238927 ESTs 2.083

133377 R05490 Hs.7239 SEC24 (S. cerevisiae) related gene family; member B 2.079

124773 R40923 Hs.106604 ESTs 2.078

117759 N47587 Hs.57345 ESTs; Weakly similar to TROPOMODULIN [H.sapiens] 2.076

127386 AI457411 Hs.106728 ESTs 2.076

101167 L15309 Hs.193677 zinc finger protein 141 (clone pHZ-44) 2.075

109597 F02582 Hs.14474 ESTs 2.074 

124390 N29325 Hs.7535 ESTs; Highly similar to COBW-like placental protein [H.sapiens] 2.07

116225 AA478609 Hs.47278 Human Chromosome 16 BAC clone CIT987SK-A-735G6 2.07

131243 R16667 Hs.24752 spectrin SH3 domain binding protein 1 2.069

130557 T90830 Hs.15981 ESTs; Weakly similar to line-1 protein ORF2 [H.sapiens] 2.067

134103 D14826 Hs.155824 cAMP responsive element modulator 2.064

108833 AA131866 Hsj61661 ESTs; Weakly similar to DY3Xs [C.elegans] 2.063

112286 R53765 Hs.158135 KIAA0981 protein 2.063

125624 AA165411 zq49a01.fi Stratagene hNT neuron (#937233) Homo sapiens cDNA clone 2.061

124612 N72200 Hs.13913 ESTs 2.058

116335 AA495830 HsX17013 ESTs 2.057

112248 R51361 Hs.23423 ESTs 2.056

115789 AA424754 Hs.43149 ESTs 2.056

107029 AA599219 Hs.187492 ESTs; Weakly similar to ALR [H.sapiens] 2.056

110294 H30270 Hs.165062 ESTs 2.054

120532 AA262354 Hs.186648 ESTs 2.054

118180 N59249 Hs.48349 ESTs 2.052

132018 AA293194 Hs.3737 ESTs 2.052
132617 AA171913
131528 N38167 
113254 T64438 
122785 AA459978 
107203 D20426 
105713 AA291321 
129385 D82675 
119116 R43845 
116405 AA600253 
125324 AA526849 
105599 AA279442 
119741 W702O5 
101449 M21494 
107109 AA609943 
117040 H89112 
132906 AA142857 
105479 AA255546 
102031 U04898 
119846 W803B3 
124809 R46482 
130286 AA041548 
124457 N50114 
125144 W37999 
120581 AA281257 
104931 AA062731 
120548 AA278846 
113933 VU81362 
123072 AA485041 
123648 AA609323 
116875 H57749 
103179 X69398 
103478 Y07755 
111007 N53378 
120470 AA251797 
112280 R53457 
114127 Z38652 
129863 AA151O05 
106320 AA436608 
108933 AA147224 
105906 AA401633 
109029 AA157911 
118470 N66769 
115353 AA281886 
115257 AA279060 
126879 M719776 
109547 F01479 
127111 AA805726 
101266 L36645 
129319 AA037467 
106211 AA428240 
112753 R93696 
120489 AA255538 
129699 AA458576 
105425 AA251129 
134740 L37362 
109324 AA210700 
124303 H93043 
102337 U36922 
109441 AA228100 
127364 AA179573 
105255 AA227498 
130672 L19783 
104301 D45332 
132442 R62589 
105519 AA258063 
132902 AA490969 
118873 N89881 
114124 Z38595 
115075 AA25548S 



H&5338 
K&28274 
Hs.11449 
K&9950S 
H&5G56 
Hs.184319 
Hs.110950 
Hs*4595 
HS55601 
HS&109 
H&143460 
H&43670 
Hs.118843 
H&32793 

H&234896 
H&23467 
H&2156 
H&58448 
Hs.106875 
Hs.154023 
Hs.128704 
Hs£4336 
Hs.125868 
Hs.108319 
Hs.187634 
H&30567 
Hs.104308 
Hs.1 12689 
Hs.161022 
H&82685 
H&38991 
Ha22543 

HSJ2604O 
Hs.106961 
Hs.129872 

Hs.71814 
H&22380 
Hs.72200 
H&82781 
H&88923 
Hs.193516 

' Hs.26966 
Hs220509 
H&73964 
H&30340 
Hs.126083 
Hs.169882 
Hs.190504 
Hs.12017 
H&24416 
HsX9455 
Hs.86405 
Hs.107070 

Hs.86998 

HsX0061 

H&3623 

Hs.177 

Hs.6783 

Hs.167419 

H&23438 

Hs.168147 

Hs.44577 

Hs.125019 

Hs£8045 



carbonic anhydrase XII
ESTs 

DKFZP564O123 protein
ESTs 
EST 



Homo sapiens clone 25007 mRNA sequence
DKFZP566E2346 protein 



1 

protein kinase dnu 
kinesin family member 3A
creatine kinase; muscle
ESTs 

y«25©5.s1 Morton Fetal Cochlea Homo sapiens cDNA clone IMAGE25328
ESTs; Highly similar to geminin [H.sapiens]
ESTs 

RAR-related orphan receptor A 
ESTs 
ESTs 

KIAA0573 protein 
ESTs 
ESTs 
ESTs 

thyroid hormone receptor-associated protein; 150 kDa subunit 
ESTs 
ESTs 
ESTs 
ESTs 
EST 

CD47 antigen (Rh-related antigen; integrin-associated signal transducer)
S100 calcium-binding protein A2
ESTs

zs11I3j1 NCI_CGAP_GCB1 Homo sapiens cDNA clone
ESTs; Weakly similar to fatty acid omega-hydroxylase [H.sapiens]
ESTs; Weakly similar to TYL [H.sapiens]
sperm surface protein 
ESTs 
ESTs 
ESTs 
ESTs 
ESTs 
ESTs 

B-cell CLL/lymphoma 10



2.05

2.05

2.05

2.05

2.05

2.046

2.042

2.04

2.04

2.039

2.037

2.037

2.036

2.034

2.034

2.031

2.027

2.027

2.024

2.024

2.023

2.017

2.017

2.014

2.012

2.011

2.011 

2.009 

2.008 

2.003 

1.995

1.995 

1.995 



1X39 
1.988 



1.988
1.986
1.982
1.982
1.975
1.975
1.974



zh38g04.s1 Soares_pineal_gland_N3HPG Homo sapiens cDNA clone IMAGE:414390 1.974



ESTs 
ESTs 
EphA4 
ESTs 
ESTs 
ESTs 
ESTs 

KIAA0439 protein; homolog of yeast ubiquitin-protein ligase Rsp5
ESTs 

opioid receptor; kappa 1 

Homo sapiens mRNA; cDNA DKFZp564P056 (from clone DKFZp564P056)
ESTs 

Human fork head domain protein (FKHR) mRNA, 3 end 
nuclear factor of activated T-cells 5



ESTs 

phosphatidylinositol glycan; class H

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs; Highly similar to KIAA0886 protein [H.sapiens]
ESTs 
1.973
1.969
1.966
1.965
1.962
1.961
1.959
1.956
1.953
1.95
1.95
1.95
1.948
1.946
1.942
1.942
1.942
1.94
1.939
1.937
1.936
1.936
1.934
1.933
110895 H33483 Hs.124777 ESTs 1531

105380 AA23S209 Hs.187628 ESTs 1531 

124998 T56013 Hs.77910 3-hydroxy-3-methylglutaryl-Coenzyme A synthase 1 (soluble) 1529

121816 AA424814 Hs.187509 ESTs 1527 

111717 R23241 Hs.110776 STAT induced STAT inhibitor-2 1525

128874 H06245 Hs.106801 ESTs 1525 

109391 AA219699 Hs.184245 KIAA0929 protein Msx2 interacting nuclear target (MINT) homolog 1513

126129 H82165 Hs.40334 ESTs 1511 

115553 AA369027 Hs.71414 ESTs 1505 

113811 W44928 Hs.4878 ESTs 1505 

108345 AA070905 zm66d1.s1 Stratagene neuroepithelium (#937231) Homo sapiens cDNA clone 1504

120472 AA251875 Hs.104472 ESTs; Weakly similar to Gag-Pol polyprotein [M.musculus] 1503

116602 D80063 Hs.241673 EST 1501

121121 AA399371 Hs.189095 ESTs; Weakly similar to zinc finger protein SALL1 [H.sapiens] 15

125330 AA401804 Hs.114574 ESTs 1596

130095 F01831 Hs.14838 ESTs 1594 

119782 W72982 Hs.58262 ESTs 1594

104115 AA428090 Hs.26102 ESTs 1593

131313 C17838 Hs.22370 Homo sapiens mRNA; cDNA DKFZp564O0122 (from clone DKFZp564O0122) 1.891

105583 AA278907 Hs54549 ESTs 1591 

122825 AA461195 HS59580 ESTs 1587 

119495 W35390 Hs.55533 ESTs 1586

130309 AA134289 Hs.15423 Homo sapiens BAC clone RQ114B19 from 7q31.1 1.886

125628 AA418069 Hs.241493 natural killer-tumor recognition sequence 1.886

110611 H66947 Hs.14871 ESTs; Highly similar to gene ERCC5 protein [H.sapiens] 1.885

117301 N22569 Hs.43215 ESTs 1584

131406 N92239 Hs.26471 Wnt inhibitory factor-1 1.881

126428 AA013312 Hs.64988 ESTs 1581

120285 AA182882 Hs.111110 titin-cap (telethonin) 1578

112724 R91753 Hs.17757 ESTs 1578 

103121 X63679 Hs.4147 translocating chain-associating membrane protein 1.875

124381 N26765 Hs.109008 ESTs 1575

117226 N20468 Hs.177322 ESTs; Weakly similar to putative p150 [H.sapiens] 1.875

105610 AA279991 Hs.124691 ESTs; Weakly similar to trithorax homologue 2 [H.sapiens] 1.875

111229 N69113 Hs.110855 ESTs 1575

120627 AA285079 Hs.190474 ESTs 1573 

107048 AA600012 Hs.10669 ESTs; Moderately similar to KIAA0400 [H.sapiens] 1572

104041 AA381902 Hs.197114 RNA binding protein 1572

115162 AA258366 Hs.227806 ras GTPase activating protein-like 1572

102239 U26726 Hs.1376 hydroxysteroid (11-beta) dehydrogenase 2 1.87

100043 M10098 AFFX control 18S ribosomal RNA 1568

120296 AA191353 Hs.22385 ESTs; Weakly similar to KIAA0970 protein [H.sapiens] 1.867

129011 S72869 Hs.107932 DNA segment; single copy; probe pH4 (transforming sequence; thyroid-1; 1.867

134851 R44479 Hs.50232 KIAA0552 gene product 1.866

117392 N26175 Hs.93405 ESTs 1564 

114530 AA053027 Hs.191797 ESTs 1563 

123541 AA608794 Hs.112592 ESTs 1563

124890 R76618 Hs.34145 ESTs; Weakly similar to RAS-RELATED PROTEIN RAB-8 [H.sapiens] 1562

105299 AA233511 Hs.194720 ATP-binding cassette; sub-family G (WHITE); member 2 1.861

103560 Z20656 Hs.182787 myosin; heavy polypept 6; cardiac muscle; alpha (cardiomyopathy; hypertrophic 1) 1.861

113073 T33637 Hs.6841 ESTs 156 

120407 AA235040 Hs.107283 ESTs 1559 

103892 AA243523 Hs.17155 ESTs 1558

123795 AA620381 Hs.70488 ESTs 1557 

108524 AA084323 Hs£8138 ESTs 1557 

113953 W85812 Hs.187554 ESTs 1.856 

110721 H97678 Hs51319 ESTs 1556 

129426 AA412087 Hs.168272 EST; Highly smlr to prot inhibitor of activated STAT prot PIASx-alpha [H.sapiens] 1.853

112102 R44840 Hs.21303 ESTs 1552

118502 N67317 Hsi0150 ESTs 1.852

107619 AA004955 Hs.60015 ESTs 1551

100436 D87446 Hs.75912 KIAA0257 protein 1.85

120652 AA287312 Hs.191648 ESTs 155 

121643 AA417078 Hs.193767 ESTs 1543 

117387 N26011 Hs53810 ESTs 1.643 

132084 Y12394 Hs.3886 karyopherin alpha 3 (importin alpha 4) 1.843

124449 N48593 Hs.121820 ESTs 1541 

120263 AA173440 Hs.193919 ESTs 1538 

127226 AA731036 Hs.5463 ribosomal protein S23 1338
111837 R36447 Hs.24453 ESTs 1.835

128727 M64174 Hs.50651 Janus kinase 1 (a protein tyrosine kinase) 1334

114439 AA018937 Hs.128629 ESTs 1333 

102332 U35637 Human nebulin mRNA, partial cds 133

126579 W72979 Hs.146082 ESTs 133 

102341 U37122 Hs.3110 adducin 3 (gamma) 133

114246 Z39348 Hs.12079 ESTs 1328 

131757 D17532 Hs316 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 6 (RNA helicase; 54kD) 1323

108904 AA136521 Hs.71148 ESTs; Weakly similar to putative p150 [H.sapiens] 1323

115084 AA255566 Hs.42484 Homo sapiens mRNA; cDNA DKFZp564C053 (from clone DKFZp564C053) 1323

131957 AA609008 Hs.183232 ESTs 1322 

100131 D12485 Hs.11951 phosphodiesterase I/nucleotide pyrophosphatase 1 (homologous to mouse Ly-41 antigen) 1322

124163 H30539 Hs.189838 ESTs 1321 

118204 K59859 Hs.48443 ESTs 1321

107727 AA016021 Hs.173091 DKFZP434K151 protein 132

100357 D78156 Hs.241548 RAS p21 protein activator 2 132

116295 AA489016 Hs.91216 ESTs; Highly similar to partial CDS; human putative tumor suppressor [H.sapiens] 1.82

124833 R54112 Hs.128697 ESTs 1317 

122587 AA4532S5 Hs.6968 ESTs 1317 

114359 Z41589 Hs.153483 ESTs; Moderately similar to HI chloride channel [H.sapiens] 1315

111289 N72253 Hs.238248 ESTs 1313

110826 N300S8 Hs.15347 ESTs 1312 

104106 AA422123 Hs.42457 ESTs 1311 

130043 AA055404 Hs.193853 ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 1253

115864 AA432080 HS3120O ESTs 131 

129737 AA056140 Hs.122684 ESTs 131 

124477 N53158 Hs.102682 ESTs 1309 

100782 HG3740-HT4010 Basic Transcription Factor 2, 34 Kda Subunit 1306 

106101 AA421053 Hs34395 ESTs 1.806 

115479 AA287596 zs52h09.s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone IMAGE:701153 1.804

116104 AA456635 Hs.78524 ESTs 1304 

114173 Z39050 Hs.21963 ESTs 1304

132632 N59764 Hs.S39a guanine monophosphate synthetase 1303

119135 R49548 Hs.169681 death effector domain-containing 1302

131559 N91087 Hs.28728 ESTs; Weakly similar to F55A12.9 [C.elegans] 1301

126922 AA177138 Hs.161671 ESTs 13

117375 N25427 Hs.108812 ESTs 13

103571 Z25535 Hs.211608 nucleoporin 153kD 13

105978 AA40S367 Hs.15973 ESTs 13

125904 H22372 Hs.163586 ESTs 1.799 

133883 AA397915 Hs.77221 choline kinase 1.798 

105777 AA348412 Hs.23098 ESTs 1.797

110166 H19480 Hs.174309 ESTs 1.796

105038 AA130273 Hs.7584 ESTs; Weakly similar to hypothetical protein; similar to [H.sapiens] 1.796

105427 AA251330 Hs.28248 ESTs 1.795

115278 AA279757 Hs.67466 ESTs; Weakly similar to BACN32G11.d [D.melanogaster] 1.794

133104 L13698 Hs.65029 growth arrest-specific 1 1.794

131170 N48674 Hs.23796 Human DNA sequence from clone 1052M9 on chromosome Xq25. Contains the 1.792

100136 D13540 Hs.22868 protein tyrosine phosphatase; non-receptor type 11 1.791

127263 AA331157 EST35035 Embryo, 6 week, subtracted (total cDNA) I Homo sapiens cDNA 1.79

114157 Z38878 Hs.24979 ESTs 1.79

125601 AI096717 Hs.247043 KIAA0525 protein 1.788

118472 N66818 Hs.42179 ESTs 1.787 

112456 R63925 Hs£8464 ESTs 1.787 

130236 N69682 Hs.31957 SC35-interacting protein 1 1.786

133297 AA600057 Hs.70266 KIAA0905 protein 1.784 

125650 R40096 Hs.176578 ESTs 1.784 

132056 T89386 Hs.38176 KIAA0606 protein; SCN Circadian Oscillatory Protein (SCOP) 1.783

129093 AA262710 Hs.108614 KIAA0627 protein 1.783

123176 AA489020 Hs.193424 ESTs 1.782

106340 AA441792 Hs.22857 chord domain-containing protein 1 1.781

100598 HG2463-HT2559 Guanine Nucleotide-Binding Protein G25k 1.779

104038 AA374532 EST86676 HSC172 cells I Homo sapiens cDNA 5' end, mRNA sequence 1.778

122235 AA436475 Hs.190104 ESTs 1.777

105104 AA151771 Hs.76941 ATPase; Na+/K+ transporting; beta 3 polypeptide 1.776

107601 AA004638 Hs.50223 ESTs 1.776

131467 W68255 Hs.27194 DKFZP434K171 protein 1.776

118449 N66413 Hs.172466 ESTs; Weakly similar to KIAA0775 protein [H.sapiens] 1.776
1SS27
32471 



27648 
06217 
31214 
05295 
06328 
24661 
22938 
15504 
05168 
29153 
105829 
01811 
00138 
24704 
22314 



06206 
07135 
05760 
06288 



29558 
17885 
07032 
24807 
00276 
10924 
33002 
32530 
10759 
06138 
07348 
15867 



13783 
34898 
32215 
04229 
16166 
15433 
14908 
27425 
31089 
13498 
16710 
27210 
20554 
29940 
17023 
11700 
16911 
06025 



11614 
34134 
06886 
17998 
21204 
21342 
31129 
16235 
02423 
10273 
08758 
10672 



AA034030 

AA342079 

T16305 

AA406105 

AA373091 

AA428379 

N26777 

AA435664 

AA436705 

N93797 

AA479166 

AA291946 

AA180208 

AA188618 

AA398290 

M86917 

013628 

R07335 

AA442257 

H02566 

AA428069 

AA620782 

AA338960 

AA435536 

AA304566 

AA234945 

N50112 

AA599472 

R45963 

042047 

N47938 

AF006082 

AA455917 

N21S71 

AA424515 

U43701 

AA432162 

AA194075 

W19222 

X98330 

T10132 

ABO02346 

AA461556 

AA284252 

AA236545 

AA470941 

Z38807 

T88908 

F10577 

R51476 

AA279654 

018242 

H88157 

R22212 

H72240 

AA412063 

AA101984 

R12581 

L76703 

M489086 

N52138 

AA4Q0422 

AA404995 

R27296 

AA479181 

U44754 

H29050 

AA127395 

H88477 



Hs.155212 

H&2S2055 

Ks.49349 

H&5344 

H&93832 

HS24870 

Hs.172635 

H&8583 

H&28020 

H&3090 

Hs.105633 

H&42736 

Hs.16606 

Hs.181481 

H&21965 

Ha24734 

H&2463 

Ks.192076 

Hs.191268 

H&89519 

Hs23247 

H&28170 

H&24338 

H&3542 

Hs.11360 

Hs.47023 

Hs247309 

H&233811 

Hs£2432 

Hs.62461 

H&50785 

Hs.19025 

H&33264 

Hs.184776 

Hs.165986 

Hs59908 

Hs.7041 

H&90821 

Hs.4233 

H&61289 

Hs502949 

H&58372 

H&54973 

Hs.143162 

H&22870 

Hs.189746 

Hs.70312 

Hs.194524 

Hs.13572 

H&41105 

Hs.23381 

H&39292 

HSJ6065 

Hsj61697 

Hs.191146 

Hs.173328 

HS36545 

HS53828 

Ha55896 

Hs.192430 

Hs23240 

Hs.186726 

Hs.179312 

Ha24096 

H&222414 

Hs.191178 



ESTs 

beta-site APP-cleaving enzyme

adaptor-related protein complex 1; gamma 1 subunit

Homo sapiens clone 24483 unknown mRNA; partial cds

ESTs

ESTs

similar to APOBEC1

KIAA0766 gene product

EphB1

ESTs 

ESTs 

ESTs; Highly similar to CQI-32 protein [H.sapiens] 

ariadne; Drosophila; homolog of

ESTs 



ye98c1.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone
ESTs

Homo sapiens mRNA; cDNA DKFZp434N174 (from clone DKFZp434N174)

KIAA1046 protein

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

succinate-CoA ligase; GDP-forming; beta subunit
ESTs; Weakly similar to ORF2 [M.musculus]
KIAA0089 protein

yy84a09.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone

ARP2 (actin-related protein 2; yeast) homolog

SEC22; vesicle trafficking protein (S. cerevisiae)-like 1

ESTs 

ESTs 

ribosomal protein L23a 
DKFZP586B2022 protein



ESTs; Weakly similar to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens]

ryanodine receptor 2 (cardiac) 

KIAA047B gene product 

synaptojanin 2 

KIAA1102 protein

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

yg76f04 xi Soares infant brain 1NIB Homo sapiens cDNA clone
ESTs 

calcium modulating ligand

ESTs 

ESTs 

ESTs; Moderately similar to KIAA0745 protein [H.sapiens]
ESTs

G-protein coupled receptor
ESTs 

protein phosphatase 2; regulatory subunit B (B56); epsilon isoform

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

small nuclear RNA activating complex; polypeptide 1; 43kD

ESTs 

ESTs 

ESTs 



1.775 

1.775 

1.776 

1.774 

1.774 

1.773 

1.773 

1.773 

1.772 

1.772 

1.772 

1.771 

1.767 

1.766 

1.764 

1.764 

1.764 

1.763 

1.762 

1.761 

1.758 

1.757 

1.756 

1.756 

1.756 

1.756 

1.754 

1.754 

1.753 

1.753 

1.751 

1.751 

1.75 

1.75 

1.75 

1.75 

1.749 

1.747 

1.747 

1.745 

1.744 

1.743 

1.743 

1.743 

1.742 

1.741 

1.739 

1.738 

1.735

1.733 

1.733 

1.732 

1.731 

1.731 

1.731 

1.728 

1.726 

1.726 

1.725 

1.725 

1.725 

1.725 

1.725 

1.725 

1.725 

1.724 

1.722 

1.722 

1.721 






120271 AA176404 Hs.111092 ESTs; Weakly similar to ZINC FINGER PROTEIN 136 [H.sapiens] 1.72

100227 D28915 Hs.82316 interferon-induced; hepatitis C-associated microtubular aggregate prot (44kD) 1.719

129232 W69459 Hs.109655 sex comb on midleg (Drosophila)-like 1 1.719

134663 W73357 HsX750 ESTs 1.717

104902 AA05547S Hs.104143 clathrin; light polypeptide (Lca) 1.717

120582 AA281290 Hs.125287 ESTs; Weakly similar to BC331 191 1 [H.sapiens] 1.717

134891 F03517 HsX0787 ESTs 1.716

106219 AA428567 Hs.26813 Homo sapiens mRNA; cDNA DKFZp586F1323 (from clone DKFZp586F1323) 1.715

116372 AA521311 Hs.13854 ESTs 1.713

107570 AA001870 Hs.237323 N-acetylglucosamine-phosphate mutase; DKFZP434B187 protein 1.713

106198 AA427816 Hs.11803 ESTs 1.712 

125136 W31479 Hs.129051 ESTs 1.712 

104973 AA085676 HsX763 KIAA0942 protein 1.712

128710 J04813 Hs.104117 cytochrome P450; subfamily IIIA (niphedipine oxidase); polypeptide 5 1.711

123994 D20899 Hs.107127 Homo sapiens mRNA; cDNA DKFZp564G022 (from clone DKFZp564G022) 1.711

127871 AA766511 Hs.128848 ESTs 1.71 

116089 AA455933 Hs.41324 ESTs 1.709

123337 AA504153 Hs.132797 ESTs; Weakly similar to ORF YGLCSOw [S.cerevisiae] 1.708

123619 AA609200 Hs.162686 ESTs 1.708

104781 AA026617 Hs.21610 ESTs; Highly similar to BAI1-associated protein 1 [H.sapiens] 1.707

115114 AA256468 HsX8148 ESTs 1.705

117852 N49408 Hs.136102 KIAA0853 protein 1.705

127644 T57570 Hs.77039 ribosomal protein S3A 1.704 

111359 N91273 Hs.27179 ESTs 1.702

131721 L35644 HSJ1092 EphA5 1.7 

132438 F08925 Hs.48610 ESTs 1.7 
132476 N67192 Hs.49476 Homo sapiens clone TUAB Cri-du-chat region mRNA 1.7
130990 F02488 Hs.21917 KIAA0768 protein 1.7
128499 AA487503 Hs.100636 ESTs 1.698
120780 AA342337 Hs.241569 ESTs; Modtly smlr to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 1.697
132920 L06133 Hs.606 ATPase; Cu++ transporting; alpha polypeptide (Menkes syndrome) 1.696
135037 U77948 Hs.184122 general transcription factor II; i 1X98
110024 H11297 Hs.31050 ESTs 1.695
134415 AA329274 Hs.82911 protein tyrosine phosphatase type IVA; member 2 1.694
102223 U24685 Hs.148226 Human anti-B cell autoantibody IgM heavy chain variable V-D-J region (VH4) gene; clone E11; VH4-63 non-productive rearrangement 1.694

126712 AA205862 Hs.7942 ESTs 1.694

101507 M27492 Hs.82112 interleukin 1 receptor; type I 1.692

106291 AA435551 Hs.30824 ESTs 1.691

116826 H58691 Hs.8215 ESTs; Weakly similar to double-stranded RNA-binding nuclear protein DRBP76 [H.sapiens] 1.69

135339 D59269 Hs.127842 Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 783648 1.69

118250 N62602 yz75b6.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone IMAGE:288851 3' similar to contains Alu repetitive element;, mRNA sequence 1.689

106470 AA450116 Hs.186180 ESTs 1.688

108203 AA057678 Hs.63408 ESTs 1.687

119748 W70313 Hs.126906 ESTs 1.686

116576 D51228 Hs.79404 neuron-specific protein 1.683

123035 AA481392 Hs.105166 ESTs 1.683

126668 AA011616 Hs.184086 ESTs 1.681

101512 M28209 Hs.250716 RAB1; member RAS oncogene family 1.678

102704 U76638 Hs£4089 BRCA1 associated RING domain 1 1.677

126218 AA256386 Hs.13649 Novel human gene mapping to chromosome 13; similar to rat RhoGAP 1.676

111180 N67277 Hs.9403 ESTs 1.676

105937 AA404342 Hs.173531 ESTs 1.675

114118 Z38520 Hs.175930 ESTs 1.675

109203 AA190834 Hs.108787 endoplasmic reticulum membrane protein 1.675 

125245 W86608 Hs.7243 ubiquitin specific protease 24 1.675

102906 X03956 Hs.75318 tubulin; alpha 1 (testis specific) 1.675

125914 AA262925 Hs.180034 cleavage stimulation factor; 3' pre-RNA; subunit 3; 77kD 1.674

134294 U63289 HsX1248 CUG triplet repeat RNA-binding protein 1 1.674

109742 F10108 Hs.183333 ESTs 1.673

134674 D63876 HSX7726 KIAA0154 protein 1.673

104079 AA402937 H&10323B ESTs 1.671 

107554 AA001386 Hs.59844 ESTs 1.671

132439 AA243139 Hs.4863 Homo sapiens clone 25088 mRNA sequence 1.669
124515 N58172 Hs.109370 ESTs 1.668 
124300 H92575 Hs.105959 ESTs; Weakly similar to B ALU SUBFAMILY SQ WARNING ENTRY II [H.sapiens] 1.668 
126609 AA743475 Hs.171693 ESTs 1.687 
106095 AA419547 Hs.11713 ESTs 1.664

101754 M77142 Hs.239489 TIA1 cytotoxic granule-associated RNA-binding protein 1.663

105188 AA192306 Hs.23926 ESTs 1.663

113582 T91371 Hs.16824 EST 1.661

119559 W38197 Accession not listed in Genbank 1.661

119961 W37535 Hs.59015 ring finger protein 9 1.657

123255 AA490890 Hs.105273 ESTs 1.657 

111078 N59230 Hs.186574 ESTs 1.655 

113082 T40528 HS5246 ESTs 1.654 

11S5B9 W44692 Hs.124177 ESTs 1.652

104308 £353639 Hs.77904 ribosomal protein S26 1.65

103073 X59417 Hs.74077 proteasome (prosome; macropain) subunit; alpha type; 6 1.65

124424 N35314 Hs.107265 ESTs 1.65

128890 AA096157 Hs.182364 ESTs; Weakly similar to 25kDa trypsin inhibitor [H.sapiens] 1.65
119400 T92767 ye27d0fis1 Stratagene lung (#937210) Homo sapiens cDNA clone IMAGE:118955 3', mRNA sequence. 1.65

131631 AA486868 Hs.29802 slit (Drosophila) homolog 2 1.65

118229 N62339 Hs.180532 heat shock 90kD protein 1; alpha 1.649

118533 N67954 Hs.49413 ESTs 1.648 

130666 AA476307 Hs.194035 KIAA0737 gene product 1.647 

103093 X60708 Hs.44926 dipeptidylpeptidase IV (CD26; adenosine deaminase complexing protein 2) 1.647

128667 U69140 Hs.103419 fasciculation and elongation protein zeta 2 (zygin II) 1.646

112933 T15530 Hs.221439 ESTs 1.646

114546 AA056263 Hs.132747 ESTs 1.645

128705 AA579377 Hs.180532 heat shock 90kD protein 1; alpha 1.644

114399 AA007595 Hs.220937 ESTs 1.642

118836 N79820 Hs.50854 ESTs 1.64

100401 D85423 Homo sapiens mRNA for Cdc5, partial cds 1.64

105681 AA284365 Hs.171228 KIAA1040 protein 1.639

132526 AA460128 Hs.5074 similar to S.pombe dim1+ 1.639

133809 AA034002 Hs.76359 catalase 1.639 

115968 AA447083 Hs.134522 ESTs 1.637 

116370 AA521256 Hs.236204 ESTs; Moderately similar to NUCLEAR PORE COMPLEX PROTEIN NUP107 [R.norvegicus] 1.631

109644 F04477 Hs.204802 ESTs; Moderately similar to GLYCERALDEHYDE 3-PHOSPHATE DEHYDROGENASE; LIVER [H.sapiens] 1.627

103427 X97303 H.sapiens mRNA for Ptg-12 protein 1.627

132186 T33888 Hs.221040 KIAA1033 protein 1.626

131428 U17838 Hs.26719 PR domain containing 2; with ZNF domain 1.626

126638 AA649257 Hs.183602 ESTs 1.625

114503 AA039568 Hs.188083 ESTs 1.625

121242 AA400857 Hs.97509 EST 1.625

122414 AA446885 Hs.99087 ESTs; Moderately similar to ZINC FINGER PROTEIN 141 [H.sapiens] 1.625

110632 H72344 Hs.171635 ESTs 1.624

111389 N95837 Hs.169111 ESTs; Weakly similar to L82A [D.melanogaster] 1.624

112449 R63802 Hs.124186 ring finger protein 2 1.623

113070 T33464 Hs.6298 ESTs 1.622 

107229 059284 HsJ4644 ESTs 1.618 

132710 W93726 Hs.55279 protease inhibitor 5 (maspin) 1.617

124S64 N94814 Hs.33540 ESTs; Weakly similar to KIAA0765 protein [H.sapiens] 1.617 

130166 AA350690 Hs.151411 KIAA0918 protein 1.616 

125040 T78451 Hs.199961 ESTs 1.615 

132972 H39627 Hs.164967 ESTs; Weakly similar to !! ALU SUBFAMILY SB WARNING ENTRY !! [H.sapiens] 1.615

115873 AA433916 Hs.90093 heat shock 70kD protein 4 1.611 

120408 AA235045 Hs.190151 ESTs 1.61 

120934 AA383773 Hs.191500 ESTs 1.61 

115259 AA279071 Hs.13453 splicing factor 3b; subunit 1; 155kD 1.609

134330 D20113 Hs£185 ESTs; Highly similar to CGI-44 protein [H.sapiens] 1.607

115117 AA256492 Hs.49007 poly(A) polymerase 1.606

125162 W44682 Hs.109896 ESTs 1.605

103946 AA285246 Hs.111650 ESTs; Weakly similar to Prt1 homolog [H.sapiens] 1.604

133389 AA166917 Hs.72639 ESTs 1.603

115528 AA342301 HsS3929 ESTs; Weakly similar to !! ALU CLASS B WARNING ENTRY !! [H.sapiens] 1.602

129704 W81301 Hs.12064 ubiquitin specific protease 22 1.602

109313 AA206800 Hs£6276 ESTs; Moderately similar to zinc finger protein dp [H.sapiens] 1.601

130457 U58091 Hs.155976 cullin 4B 1.6

123076 AA485211 Hs.190046 ESTs 1.6

115113 AA255460 Hs.44610 ESTs 1.6

117731 N46433 Hs.46609 ESTs 1.6
110455 H52172

119780 W72967 
126983 AA211537 



AA504338 Hs.171857 ESTs 1599 

Hs.3238 adenovirus 5 E1A binding protein 1587

Hs.151791 KIAA0092 gene product 1.596

Hs.72324 ESTs; Highly similar to unknown [H.sapiens] 1596

Hs.199832 ESTs 1596

Hs.10130 ESTs 1594
yb68f02.s1 Stratagene ovary (#937217) Homo sapiens cDNA clone IMAGE:76347 3', mRNA sequence. 1592

Hs.10176 ESTs 1589
yt85e8.s1 Soares_pineal_gland_N3HPG Homo sapiens cDNA clone IMAGE:23111 3' similar to contains Alu repetitive element;, mRNA sequence 1589

Hs.191381 ESTs; Weakly similar to hypothetical protein [H.sapiens] 1587
zn55d01.r1 Stratagene muscle 937209 Homo sapiens cDNA clone IMAGE:562081 5', mRNA sequence. 1586

134675 AA250745 Hs.57773 protein kinase; cAMP-dependent; catalytic; beta 1584
105431 AA252033 Hs.15036 ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 1584

120187 Z40251 Hs.56974 ESTs 1584

115830 AA428137 Hs.56434 ESTs 1581

135069 AA456311 Hs.53961 ESTs; Weakly similar to !! ALU CLASS A WARNING ENTRY !! [H.sapiens] 1581

122997 AA4792S5 Hs.106290 Kelch motif containing protein 1581 

119707 W67569 Hs.44143 ESTs; Weakly similar to SNF2alpha protein [H.sapiens] 158 

131934 D80948 Hs.34922 ESTs 158 

106141 AA424558 Hs.5302 phosducin-like 158

115271 AA279422 Hs.5724 ESTs 1579

131468 R27598 Hs.27197 KIAA0797 protein 1577

131165 R98173 Hs.23763 Max-interacting protein 1575

117273 N21680 Hs.43047 ESTs 1575 

101569 M33772 Hs.182421 troponin C2; fast 1575 

118127 AA459703 Hs.79070 v-myc avian myelocytomatosis viral oncogene homolog 1575

120022 WS0625 HS58432 ESTs 1575 

117512 N32157 Hs52207 ESTs 1574 

106511 AA452865 Hs.206713 UDP-Gal:betaGlcNAc beta 1;4-galactosyltransferase; polypeptide 2 1573

116415 AA609204 Hs.27973 KIAA0374 protein 1573 

127879 AA810215 Hs.189079 ESTs 1571 
125211 W72798 Hs.103177 ESTs; Wkly smlr to cDNA EST EMBL:D32579 comes from this gene [C.elegans] 1571

114746 AA135638 HsJ223756 ESTs 1.571 

122698 AA456112 Hs.99410 ESTs 157 

116765 H12638 Hs.121585 ESTs; Weakly similar to reverse transcriptase [H.sapiens] 1568

130895 AA609828 Hs.21015 ESTs; Highly similar to tetracycline transporter-like protein [M.musculus] 1568

114338 Z41366 Hs.40109 KIAA0872 protein 1567 

111005 N53076 H.5996 ESTs 1567 

128135 AA913491 Hs.189143 ESTs; Modtly smlr to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 1567

112046 R43365 Hs.22273 ESTs 1566

132160 AA281770 Hs.184081 seven in absentia (Drosophila) homolog 1 1566

111568 R10153 Hs.20561 ESTs 1566

127775 H04106 Hs.179902 ESTs; Weakly similar to NG22 [H.sapiens] 1566

115359 AA281936 Hs.88914 ESTs 1566

121845 AA425734 Hs.165066 ESTs; Weakly similar to hypothetical protein [H.sapiens] 1565
127854 AA769520 ESTs; Weakly similar to REGULATOR OF MITOTIC SPINDLE ASSEMBLY 1 [H.sapiens] 1564

120287 AA187679 Hs.111114 ESTs 1563 

114940 AA243012 Hs.75928 ESTs 1562 

126716 AA031700 Hs251962 ESTs " 1562 

134161 U97188 Hs.79440 IGF-H mRNA-binding protein 3 1561 

125390 H95094 Hs.75187 translocase of outer mitochondrial membrane 20 (yeast) homolog 1561 

115334 AA281244 Hs55300 ESTs 1559 

113721 T97931 Hs.18190 EST 1558 

114895 AA236177 Hs.76591 KIAA0887 protein 1558 

119341 T62571 Hs.146388 microtubule-associated protein 7 1558 

108012 AA039616 Hs.61933 ESTs 1558 

130335 AA156499 Hs.8454 protein kinase; cAMP-dependent; regulatory; type II; alpha 1557 

134351 R82074 Hs52109 syndecan 1 1557 

133300 D51401 Hs.70333 ESTs 1553 

106920 AA490899 Hs24462 ESTs 1553 

118744 N74075 Hs.94293 EST 1552 

126489 W20016 Hs.144228 ESTs; Weakly similar to ZINC FINGER PROTEIN 83 [H.sapiens] 155 

115913 AA438720 Hs.65487 ESTs 155 

107868 AA025234 Hs.61260 ESTs 155 

134520 N21407 Hs257325 ESTs 155 
109703 F09684 H&24782 ESTs; Weakly similar to ORF YOR283W [S.cerevisiae] 155 

120288 AA187S38 H&55189 ESTs; Weakly similar to F25B53 [Oelegarrs] 1548 

105356 AA443277 Hs51034 peroxisomal biogenesis factor 11A 1548 

129460 AA235627 Hs.11171 APG5 (autophagy 5; S. cerevisiae)-like 1547 

133950 011861 Hs.77823 ESTs 1546 

128172 AI400862 Hs.142607 ESTs 1546 

114162 238909 H&22265 ESTs 1545 

101803 M8S546 Hs.155691 pre-B-cell leukemia transcription factor 1 1544 

113617 T83630 Hs.17207 ESTs 1542 

104896 AA054228 Hs23165 ESTs 1541 

114477 AA032013 Hs.144260 EST 154 

110731 H98653 Hs.188006 KIAA0878 protein 154 

130367 Z38501 H&8768 ESTs; WWy smlr to ft ALU SUBFAMILY SQ WARNING ENTRY U [H.sapiens] 1538 

130539 L07044 Hs250857 Homo sapiens calcium/calmodulin-dependent protein kinase II mRNA; partial cds 1538 

134921 W60186 Hs.169487 Kreisler (mouse) maf-related leucine zipper homolog 1537 

130583 W24957 Hs.16281 ESTs; Moderately similar to similar to C.elegans protein encoded in cosmid T20D3 [H.sapiens] 1537 

133723 AA088851 Hs.75744 S-adenosylmethionine decarboxylase 1 1537 

106450 AA449469 Hs.11859 ESTs 1536 

104120 AA429838 Hs59519 KIAA1046 protein 1536 

100533 HG18794IT1919 Ras-Like Protein Tc10 1535 

130664 R09049 Hs.17625 ESTs 1535 

127122 AA279153 Hs.190049 ESTs 1535 

134264 T03391 H&8087 ESTs 1535 

132319 AA418662 Hs.44625 ESTs 1535 

115465 AA286941 Hs.43691 ESTs 1533 

125003 T59442 Hs.100445 ESTs 1532 

102273 U30888 Hs.75981 ubiquitin specific protease 14 (tRNA-guanine transglycosylase) 1532 

121875 AA426299 HsSffilO ESTs 1532 

114366 Z41747 Hs.469 succinate dehydrogenase complex; subunit A; flavoprotein (Fp) 1531 

132944 AA054515 Hs.6127 ESTs; Weakly similar to prostate-specific transglutaminase [H.sapiens] 153 

111199 N68210 HS29822 ESTs 153 

113494 T88878 Hs358738 ESTs 1529 

129515 AA490882 Hs.112227 ESTs 1528 

133124 AA156049 Hs.65490 ESTs 1528 

104785 AA027163 Hs.7942 ESTs 1526 

105595 AA279408 Hs25866 ESTs 1526 

130198 U67156 Hs.151988 mitogen-activated protein kinase kinase kinase 5 1526 

114297 Z40758 Hs.173091 DKFZP434K151 protein 1525 

112876 T03488 Hs.4842 ESTs 1525 

127500 AA525014 Hs.162115 ESTs 1525 

120519 AA258585 Hs.129887 cadherin 19 (NOTE: redefinition of symbol) 1525 

119859 W80702 Hs58461 ESTs 1525 

129944 L00389 Hs.1361 cytochrome P450; subfamily I (aromatic compound-inducible); polypeptide 2 1524 

118864 N89S70 H&42148 ESTs; Weakly similar to Su(P) p jnelanogaster] 1523 

123964 C13961 H&210115 EST 1523 

111676 R19414 Hs.166459 ESTs 1522 

128332 AI079523 Hs.134173 ESTs 1522 

130455 X17059 Hs.155956 N-acetyltransferase 1 (arylamine N-acetyltransferase) 1521 

125181 W58461 Hs.12396 ESTs 1521 
127093 AA768241 oa72d02^1 NCLCGAP_6CB1 Homo sapiens cONA clone 

IMAGE13177953\mRNAsequence. 1521 

132156 AA157401 Hs.4113 S-adenosylhomocysteine hydrolase-like 1 1521 

125303 Z39821 Hs.107295 ESTs 152 

132697 AA281951 Hs5518 Homo sapiens mRNA; cDNA DKFZp566J2146 (from clone DKFZp566J2146) 152 

117086 H93135 Hs.41840 ESTs 1519 

113355 T79203 Hs.14480 ESTs 1518 

108621 AA101811 Hs.69506 ESTs 1518 

109384 AA219172 Hs56849 EST 1518 

128510 X94703 Hs.100816 RAB28; member RAS oncogene family 1517 

132968 N77151 Hs.61638 myosin X 1515 

117035 H88798 Hs.41182 ESTs 1515 

116781 H22985 Hs52132 ESTs 1513 

108677 AA115629 Hs.118531 ESTs 1513 

130214 H78003 Hs.15266 ESTs 1513 

134700 AA481414 H&8868 golgi SNAP receptor complex member 1 1512 

116616 080783 Hs.45224 ESTs 1508 

126257 N99638 tumor necrosis factor receptor superfamily; member 10b 1508 

125859 AA806808 Hs.118797 ubiquitin-conjugating enzyme E2D 3 (homologous to yeast UBC4/5) 1508 
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113837 W57698 HS5888 ESTs 1507 

114317 Z41038 Hs.469 succinate dehydrogenase complex; subunit A; flavoprotein (Fp) 1507 

100311 D50640 Hs.184653 phosphodiesterase 3B; cGMP-inhibited 1507 

126802 AA947601 Hs57056 ESTs 1506 

128861 R82837 Hs.103329 KIAA0970 protein 1506 

134194 AA233231 Hs.78828 ESTs 1506 

108953 AA1498S2 Hs.42128 ESTs 1504 

133240 D31161 Hs.68813 ESTs 1502 

132671 X76302 Hs54649 putative nucleic acid binding protein RY-1 1501 

132609 Z48923 Hs53250 bone morphogenetic protein receptor; type II (serine/threonine kinase) 1501 

105574 AA278678 H&258567 ESTs ' 15 

113718 T97782 Hs.256268 ESTs 15 

127824 AE08365 Hs.127811 ESTs 15 

130132 U55936 Hs.184376 synaptosomal-associated protein; 23kD 15 
127394 AA453224 ESTs; WeaMy similar b tl ALU SUBFAMILY J WARNING ENTRY U [Haptens] 15 

100485 HG1111-HT1111 Ras-Like Protein Tc21 15 

101078 L04510 Hs.782 ADP-ribosylation factor domain protein 1; 64kD 15 

128611 AA456845 Hs.102471 KIAA0680 gene product 15 
TABLE 12A shows the accession numbers for those Pkeys lacking UniGene IDs in 
Table 12. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from 
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity 
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank 
accession numbers for the sequences comprising each cluster are listed in the "Accession" 
column. 



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number 

Accession: Genbank accession numbers 
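
The three-column layout described above (Pkey, CAT number, Accession) can be read programmatically. The following is a minimal sketch, assuming each table row is a single whitespace-delimited line whose first field is the Pkey, whose second field is the CAT number, and whose remaining fields are Genbank accession numbers. The names `ClusterRow`, `parse_cluster_row` and `index_by_accession` are hypothetical helpers introduced only for illustration and are not part of the original disclosure.

```python
from typing import Dict, Iterable, List, NamedTuple


class ClusterRow(NamedTuple):
    pkey: str              # unique Eos probeset identifier number
    cat_number: str        # gene cluster number
    accessions: List[str]  # Genbank accession numbers in the cluster


def parse_cluster_row(line: str) -> ClusterRow:
    """Split one whitespace-delimited row into its three columns."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected Pkey, CAT number and accessions: {line!r}")
    return ClusterRow(fields[0], fields[1], fields[2:])


def index_by_accession(rows: Iterable[ClusterRow]) -> Dict[str, List[str]]:
    """Map each accession number to the Pkeys whose cluster contains it."""
    index: Dict[str, List[str]] = {}
    for row in rows:
        for accession in row.accessions:
            index.setdefault(accession, []).append(row.pkey)
    return index


rows = [parse_cluster_row("124704 292319.1 R07335 R07840")]
lookup = index_by_accession(rows)
```

Rows whose accession lists wrap onto continuation lines would need to be joined before being passed to `parse_cluster_row`; the sketch does not attempt that.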



Pkey CAT number Accession 



108538 119811_1 AA084524 AA339253 AW966289 

117040 46956J AW970600 AA503323 K89218 AF086031 H891 12 

100782 184S7J AA355435 NM.001516 Z30093 T28405 AW949486 AA461142 AM10532 A1652073 AA521208 AH70141 AI9S8234 AI026102 

AA713583 AW13S876 AA936814 AA770300 AI242635 AA377033 AW960263 AW607683 AI273603 AA410287 AI040513 
AA460838 AI803916 AVC94095 AW449680 AW798677 AW675048 BE542116 AL120521 

100819 3022J L34840 NMJD03241 U31905 A1546931 AI791616 AI973065 AJ792321 AI546937 AI685880 AI732835 AI682360 AA420653 

AA564047 A1682323 A1824614 AI6S9889 AI680052 AI970887 A1623108 AA420692 AI418074 AA631018 AI810595 AW291463 
AW449930 AI&S8908 AI970818 

100824 S38 A1393237A1521317AI761348AR)25841 D43968 AW994987 L34598 AF025841 D89789 D89788D89790AW998S32 

A1971742 AI310238 XS0976 AW139668 AW674280 A1365552 AA877452 AV657554 C75229 AA376077 AI798056 AW609213 
W25586 H30149 BE075089 BE07S190 AW5808S8 H99S98 AA425238 AA133916 AW363478 BE158121 BE158127 
AW467S60 BE158135 BE158126 BE158145 M92860 AA847246 A1951688 AI361423 AA878154 AA043767 AB63712 
A1559226 AW339007 A1371266 A1388901 AA046624 AA134739 AW449154 M130232 AMS8720 AA96251 1 AI700627 
R70437 AW004008 AA045229 A1671572 H99599 AA043768 AI6854S4 AI871685 N29937 X90977 AA524240 AI1421 14 
AI825750 A1567805 AI631365 AI347893 AA134740 F20669 AA046707 AW793216 AW963298 AW959380 AA363265 
AJ784593 AI268201 R69451 AV6S7818 AI695588 

125004 264197J BE312163 AJ230793 AA374482 A1926059 AA622653 A18S0704 BE139185 AW296884 T6Q238 T50120 

102313 27608.1 U33921 AI190489AAS73311 

102337 553_1 AI814663 AA806761 AA765241 AA019317 AAQ92255 AA035405 T85079 AA890151 Ai373959 T85080 BE1S3728 AA740848 

BE080682 AUM8137 AW182316 AI699468 AW274481 AW407538 AA306562 AW950024 AVW49943 AUJ45703 AW843198 
W25132 BE612794 AA304266 AW958054 H25673 AV646563 AV646573 BE172990 AW593488 AA38S181 AA164998 
AI246476 AA345406 AI277554AA134749AA8S6824 BE613247 AA299003 AL048138 AA028121 T82510AI923835 
AWD20440AI401594AI889401 N93290AA044247 AA028100AI582845AA811151 AI741811 AI925878 AA448277 AA172221 
AI21 4783 BE220793 AA022746 AI082882 AA022849 AI928385 AA573472 A1420686 AW0729Q2 AI799493 AI873506 
A1468977 AI192079 AI468976 AA044272 AW015701 AW316979 AA933042 AA6O9017 AI318333 A1424571 A1934945 
AA172023 AW050917 AA848180 AA134748 AI003947 AI766769 AW006697 AA653517 AW575680 AI474214 AA401478 
U36922 AA927084 AA868000 062654 T91745 AW500202 AA194764 AA74S346 AA130464 AW1 17498 AA054526 N26432 
K02534 K049S4 AW303367 BE300931 A1218049 AI208073 AW182749 AA983630 AI147585 AA194765 AA054534 AA922720 
AI436585 AB4653S AA134269 AA280923 AA8S7422 AA019559 AW274010 AA035406 AA917879 H99327 W32908 AI216048 
AW49S823 AA019414 H82288 W35284 AJ93S621 AI7671 13 AA866177 AW357874 H82398 AF032885 AW300151 AW467069 
AA809348 AI188507 A1494178 AA872752 AI631631 UQ2310 NM_002015 AA815006 AI382453 AW197658 A1761654 
AI8043SS A1382221 AI813640 AI439635 A1523901 AW517242 AK21705 AW298104 AW204560 AVW73095 AW028783 
AW014650 AI766744 AI808294 AI698758 AI041809 AI766667 AI479103 AA872797 AA769305 AA765080 AA334166 
AI472322 

124704 292319.1 R07335 R07840 

116988 185904.1 AW953679 AW953S80AA244436H82527AA381046 AA244483 H82526 
124825 330773 1 AA501669 R52088 
110455 46874.1 H52576 AF085971 H52172 

126257 182217 1 N99638 AW973750 AA328271 H90994 AA558020 AA234435 N59599 R94815 
125624 154135 1 AW968363 AA465492 R34539 AA1 65411 
104038 264235 1 AA374532 AA421255 

103427 43892.1 BE514383 AA071273 AW247987 AW673286 BE312102 AW749824 BE071985 AW577383 BE071945 BE072005 AW577355 

BE0719S5 AW239231 BEO72O0O BE071960 AW577360 AW749830 AW373020 X97303 AW899522 BEO0O192 BE562219 
BE266655BE264970 

104142 113242.1 AA074713 AA447006 

127093 47721 1 AW977549 AA256038 AL365415 AW500455 M768241 AW968097 Z17349 AA256104 
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125873 10492.1 
125954 4457J 



125992 15B9048J 
127210 15307_6 



127263 232161J 
135197 29440J 

127394 304844J 

126379 1S60_2 

126983 171841J 

120470 188975.1 

127854 443883J 

121367 280429J 

106320 6435J 



115479 201515J 
101026 11075J 

100401 24827 1 



130542 28039J3 



100485 30576.2 

108345 112277JB 
100522 19669.1 

100533 32905J 

100598 23902.2 



102332 14745.3 

118250 genbank_N626Q2 

103678 entrez_Z84483 

119400 gsnbanK_T92767 

119559 enlre?.W38197 



AW271838 AL133605 C01 646 H29959 AA999896 D60676 AW999454 AW961 176 AA31 5244 H14437 AW3861 18 N46512 
AW272021 AI768516 BE466421 AI082809 AI804454 AA905101 AW173368 N38942 AW614169 AI080483 N29489 AI500550 
AA994475 AAB14464 AA707388 AA593145 AA569473 AW627815 AI828244 N63226 N42300 

NMJ016353 AB023584 W44753 R09565 AA382865 R23772 AI814257 AA974046 AK001608 AI935638 AW440609 AI420022 
AA777386 AAB06969 AI554876 AI584006 AB88556 AI688634 AI697997 AI014540 AI806683 AI741202 AW263154 
AW297238 AI149951 AI589076 AW082158 AWB14285 AA931887 AA781989 R09490 AA484643 AE07121 AI088390 
AI538065 AI619547 AI741925 AI702846 H40846 R93943 AW747979 AA461348 U30163 AA326023 AB35992 AW242870 
A1244025 AE22558 W38425 AW473630 AI624599 AI921226 AI683152 AKB6458 AI123822 AW170802 C16447 AQ37674 
D25726 AW339366 AW771259 AA461174 
H48372 W01626 
AA305278AA223833 

110924 6443.1 AW058463 AF195766 AAS80145 T86901 W60373 W60281 NM_007222 AF1 06862 AKM0785 AA167188 
AW884503 AW891313 AW891332 AW891312 AI984924 AI123518 N75170 AA131614 H25330 AI913358 AI742277 W25576 
R53771 AW445159 AW888628 AW888627 AW274674 AI088482 N52314 N34282 AW001769 AI338943 T66784 AI288963 
AW468676 AW237528 H25289 N71690 AA610128 AI143458 AI082599 N49144 AAS54773 AW66341 1 AW810151 N47938 
AW601626 AA167189 AA918304 AA805205 BE069496 AA652836 BE069499 AI699298 AW249926 AW888578 BB67635 
T10726 AW604715 D54245 053062 D55610 D55555 AA301376 Al 133498 N77788 A1936320 AW090734 A1269977 N50828 
AA550814 AI421993 AKM5384 N50813 060292 D59349 AA131710 D81698 081699 
AA331 156 AA331 157 AA331 155 

U76456 NM.003256 AF057532 AA193414 AW293304 AW963378 AA313095 AI359841 AI969312 AI080163 AW448926 

AI671 136 BE466399 AI637967 AK71873 AW198S83 AW071635 AI634427 AW296B72 AW292470 M193650 

BE161832 AA453224 AA485772 

D90391 M55575 AI652268 AA719776 

AA524886 AW971347 AA211537 

AW971327 AA524988 AW628653 AA251797 

AW976796AA769520 

AA432071 AA405648 AWOO09O8 T16347 

AB028957AL120001 AE67678 H10928 R19844 AW970334 AA393182 F05472F11711 H09908 N50250 AI815411 BE463679 
D61468 AW970253 D60889 C15548 D61011 D60867 A1815795 AA534831 D81386 AW235039 A1382158 D81174 AA416899 
AA852310 H09789 H10929 H09813 F09369 R44721 D51515 Z38456 R14004 T66255 F12148 F12139 AW351702 M85350 
A101 8713 AW972450 AW972645 AA514964 T66172 F09785 F09776 AA436608 T05327 T071 1 8 AA339352 
AW301 608 N4670S AA649093 AA287595 AWB1 1753 AA287596 N39260 

NMJD01874 J04970 T91426 AW205201 T84979 AA255727 AA847837 R02164 T91339 AV651884 AV651835 AV651350 
AV6501 18 AV651333 A1272002 A1367796 AA830651 AA2621 12 AW151 198 

AU076696 AA219720 AL135197 AA305877 N58376 AA318063 M130725 AW954903 BE541230 AW383312 U86753 085423 
AI679458 AI122932 AB007892 AB83919 BE160134 F08104 R34903 F13440 AA095444 AA262453 AA191036 R17895 
T81266 BE149776 AI279537 AI1431 13 AA361072 AW959030 AW268817 AA81 1533 BE275179 AI221677 T65147 R49293 
AA249176 BE00O290 AA768053 F09494 BE092645 BE172099 Z41 177 AA044750 AB09768 BE140795 BE140574 AW845210 
AW752452 BE243244 AA843664 A1300C80 BE169032 AW189979 BE004869 AA621872 AI951772 AI678897 AB26598 
N62813 AJ350912 AW608791 AI309602 AB83138 AWB75592 AI655073 AW875626 AA130606 AI370827 C75528 C75554 
AW263335 AI344426 BE004788 AA576220 AA604824 A1431405 AA749378 R38882 AW955075 AA173821 C75657 
AA219672 AW768408 R43141 A1431414 AA483343 A1673792 T17294 AW770187 N74285 A1476404 AI088288 AA654152 
AW974864 BE617311 BE243328 BE168049 

U64675 AW1 67507 AW167508 BE218568 AA779360 W85722 AL044843 BE159404 AF012086 AW89861 1 AW898610 
BE1S9405 BE092191 AW890826 AW369841 AW368064 AW606702 AUM4731 R82691 AA419346 AA41655B H96045 
AL040450 A1640531 AI808434 AUM661 3 AW855784 AW362469 AUM8881 AL049015 AA094272 AA888908 AA417294 
AW237786 R59793 AL044916 082402 AI216654 AI079342 H96406 AL037845 AI915900 AA972133 AI478783 T31074 
Z21 135 Z21396 AA352182 R13918 AA430178 C17811 AI371824 AI742256 AA926801 N79156 AA350610 AA081971 N83839 
R35544 AA312292 AW952080 N42322 AA171957 AA565297 R89207 AA504106 AI630782 AA826482 A1301579 T36241 
AW966618 Z28426 AL043480 AI124636 AA393449T19504 AW887823 AI289814 N53979 AL043571 AK32764 AI859613 
A1936308 AI683212 A1984499 AI133258 C05893 AW512761 AI041260BE466240Z19161 AI351190 N67549AI373374 
AA400873 AW440914 AW51 4879 AA770146 AI358754 R51113 AK83773 AA649886 T30543 D54358 R37750 T03358 
T15451 T15880 AA999689 N67396 AI056289 T85597 N62441 R89099 R00035 T85596 R61335 R00128 N63359 AI535964 
A1207768 M31468 NW.012250 W01322 AA253280 AA253233 AA293148 AW582106 R79880 AA459547 AA363459 
AA234396 N31669 H44468 AA434587 AW3S3088 AW993541 
AA070906AA070934 

X51501 NM.002652 Y10179 J03460 AI791618 AI821473 AA916588 AA564296 AA9161 10 AI972286 AI420470 AI568790 
A1597724 AW205207 AI659305 AI791620 AA532383 AI821475 AA526498 

NM.012249 M31470 AL043108 AA262561 AA178883 729433 AA313329 W48807 AW404323 AA453560 AW403227 H94816 
W17101 AA165152 W23989 AA091310 

AL121734 D54896 AA424269 BE242906 AA3621 18 BE018454 AI280348 AL048769 M35543 AA757734 AI128865 H20289 

H23728 A1203445 H41481 H18237 H44081 H92839 AI928621 H75675 051 148 AI796198 AW390453 055579 D54145 D53996 

054015 R37664 H17541 AA6S8681 T65061 R15867 AW468123 R16049 H69030 AA054226 H16070 F09655 R92144 T03521 

R05473 H92B40 AA018186 R91707 

U35637AA112989Z19308 

N62602 

Z84483 

T92767 

W38197 
TABLE 13 shows genes, including expressed sequence tags, up-regulated in prostate tumor 
tissue compared to normal prostate tissue, as analyzed using the Affymetrix/Eos Hu02 GeneChip 
array. Shown are the ratios of "average" normal prostate to "average" prostate cancer tissues. 
Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Background subtracted normal prostate : prostate tumor tissue 
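
The R1 column can be illustrated with a short calculation. The following is a minimal sketch, assuming R1 is the mean background-subtracted signal across normal prostate samples divided by the mean background-subtracted signal across prostate tumor samples, so that a small R1 corresponds to up-regulation in tumor. The function names, the single scalar `background` parameter and the sample intensities are hypothetical; the source does not specify how the averages or the background were computed.

```python
from statistics import mean
from typing import Sequence


def r1_ratio(normal: Sequence[float], tumor: Sequence[float],
             background: float = 0.0) -> float:
    """Ratio of average normal signal to average tumor signal.

    Assumption: one scalar background value is subtracted from every
    intensity before averaging.
    """
    normal_avg = mean(s - background for s in normal)
    tumor_avg = mean(s - background for s in tumor)
    if tumor_avg <= 0:
        raise ValueError("average tumor signal must be positive")
    return normal_avg / tumor_avg


def fold_up_in_tumor(r1: float) -> float:
    """Convert a normal:tumor ratio into fold up-regulation in tumor."""
    if r1 <= 0:
        raise ValueError("R1 must be positive")
    return 1.0 / r1


# Hypothetical intensities: normal averages 40, tumor averages 400.
ratio = r1_ratio([30, 50], [400, 400])
fold = fold_up_in_tumor(ratio)
```

Under these assumptions an R1 of 0.1 corresponds to a roughly ten-fold higher average signal in tumor than in normal tissue.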




Pkey ExAccn UnigeneID Unigene Title R1 


333516 




CH22_FGENES.173_1 


0X128 


337954 




CH22_EMJ\C005500.GENSCAN5M 


0.029 


332496 


R73299 


Hs204354 ras homolog gene family; member B 


0X0 


337944 




CH22_EIAAC005500.GENSCAH89-7 


0XB3 


334111 




CH22_FGENE&330 10 


0.033 


333657 




CH22_FGENE&241_2 


0.034 ■=■ 


327718 




CHXM_hsg!|65252B4 


0.034 


336355 




CH22_R3ENESX417_5 


0.035 


322011 


AL1 37354 


EST cluster (not in UnlQene) 


0.035 


336377 




CH22_FGENESXei 5 


0036 


300254 


AW079607 


Hs.188417 ESTs;WeaWy similar to ZnT-3[H.sapiens] 


0X07 


330096 




CH.19_p2 $6015278 


0X137 


335191 




CH22J=GENE&507_6 


0XBB 


334040 




CH22_FGENE&322_8 


0.039 


333586 




CH22_FGENES.204 2 


0.04 


333295 




CH22J=GENES.132J2 


0.042 


313326 


AI08B120 


Hs.122329 ESTs 


0.043 


329517 




CH.10_p2gi|3983513 


0.043 


333403 




CH22_FGENES.144_?1 


0X143 


335226 




CH22_FGENESS13 11 


0.044 


335976 




CH22_FGENESX62_11 


0X145 


333637 




CH22 FGENES229_2 


0.046 


334582 




CH22_FQENES.407_5 


0.045 


336437 




CH22_FGENES£28_4 


0.047 


337461 




CH22_FGENES.782-1 


0X147 


302892 


N58545 


Hs.6975 histone deacetylase 3 


0X149 


338689 




CH22_EMAC005500.GENSCAN.47M 


0X149 


334721 




CH22_FGENES.421_32 


0X149 


305867 


AA864572 


EST singleton (not in UniQene) with exon hit 


0.049 


335498 




CH22_FGENES£71 7 


0X15 


311596 


AI682088 


Hs323368 ESTs 


0X6 


326959 




CR21_hsgi|6469838 


0.051 


311688 


AWQ25661 


Hs-240090 ESTs 


0.052 


317298 


AI922374 


Hs.158549 ESTs 


0X152 


332984 




CH22_FGENES54_6 


0.052 


321039 


AW247083 


EST cluster (not in UniQene) 


0.053 


335844 




CH22_FGENESXi23 4 


0.053 


325371 




CH.12_hsgil5866920 


0.054 


335667 




CH22_FGENES590 18 


0.054 


333635 




CH22 FGENES.228_2 


0.054 


338736 




CH22_FGENES.110-2 


0.055 


335893 




CH22_FGENES.63S 1 


0.055 


333170 




CH22_FGENES.94 5 


0.055 


329768 




CH.14_p2gi|6015501 


0.055 


334030 




CH22_FGENES.320_2 


0X155 


323359 


AA234172 


Hs.137418 ESTs 


0.055 


300453 


AW051431 


Hs.1 13029 ribosomal protein S25 


0.055 


334262 




CH22_FGENES.367_12 


0.055 


306590 


AI000246 


EST singleton (not In UniQene) wBh exon hit 


0X65 


331087 


R22520 


H&23398 ESTs 


0X65 


338620 




CH22_EMAC005500.GENSCAM.450-18 


0X66 


339045 




CH22_DA59H18.GENSCAN5fr5 


0.056 


308023 


AI452732 


EST singleton (not in UniQene) wffli exon hit 


0.057 
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339067 


CH22J)A59Hia.GENS<m33-3 


ossr 


335689 


CH22J=GENES.596 4 


0X57 


339069 


CH22J3A59H18.GENSCAN.3W 


0X57 


338176 


CH22JEMACO055OO.GENSCAH21W 


0.057 


328159 


Ca06Jsgi|5868065 


0.058 


335655 


CH22JGENES590J6 


0JJ58 


336371 


CH22_FGENES.820J 


6m 


336558 


CH22_FGENES.842_3 


0X59 


337738 


OC2_atACWX»97X3ENSCAN.10M 


om 


334273 


CH22_fGENE&369_2 


0X59 


335889 


CH22_FGENESj633 3 


om 


327807 


Cti05_hsgl]5867968 


om 


333315 


CH22_FQENES.138 7 


om 


338825 


CH22JXE46D7.GENSCAN/W 


ox 


337612 


CH22_C20H12£ENSCAN52-5 


0.06 


333897 


CH22_FGENES293 4 


0.06 


335990 


CH22_FGENES.655_4 


0.06 


334264 


CH22_FQENESJ67 15 


0.06 


338653 


CH22_EfAAC005500.GBJSCAN^39 


aoei 


322303 W07459 


EST cluster (not In UnlGene) 


0.061 


333498 


CH22_FGENES.168 8 


0X61 


336522 


CH22_FGB4ESJ39 3 


0.061 


301357 AW295677 


Hs.137840 ESTs; Moderately similar to HOMEOBOX 






PROTEIN SIX1 [H^apiens] 


0.062 


305917 AA876469 


Hs.181357 laminin receptor 1 (67kD; ribosomal protein SA) 


0X62 


336143 


CH22_FGENES.705 5 


0X63 


333433 


CH22J=GENES.168_2 


0X63 


332533 M99487 


Hs.1915 folate hydrolase (prostate-specific membrane antigen) 1 


0X63 


325644 


CR16JsgIj6552453 


0X63 


336402 


CH22_FGENESX23_17 


0X63 


335767 


CH22_FGENES.607_1 


0X64 


301893 T80334 


EST cluster (not in UniGene) with exon hit 


0X64 


324019 AW177009 


EST cluster (not in UniGene) 


0X64 


305801 AA845997 


EST singleton (not in UniGene) with exon hit 


0X64 


335188 


CH22_FGENES507 3 


0X65 


337533 


CH22_FGENES.828-2 


0.065 


333311 


CH22_fGENES.138_3 


0X65 


335668 


CH22_FGENESJ90_19 


0X65 


306786 AI041589 


EST singleton (not in UniGene) with exon ha 


0X66 


306365 AA962086 


EST singleton (not In UniGene) with exon hit 


0.068 


306249 AA933840 


EST singleton (not In UniGene) with exon hit 


0X66 


335018 


CH22_FGENES.474 6 


0.066 


333594 


CH22.FGENES210 3 


O.066 


333900 


CH22_FGENES293_7 


0.066 


325207 


CH.10_hsgi|6552430 


O.067 


329888 


CH.15_p2glj6067149 


0.067 


326238 


CH.17Jlsgi]5867260 


0.067 


333658 


CH22_fGENES241 4 


0.067 


335809 


CH22_FGENES.617_6 


0.068 


307427 AI243437 


EST singleton (not in UniGene) wiih exon hit 


0X68 


318428 AI949409 


Hs224583 ESTs 


0.069 


327005 


CH21_hsgq5867664 


0.069 


330463 HQ99S-HT998 


Sulfotransferase, Phenol-Preferring 


0.069 


333318 


CH22_FGENES.138 10 


0.07 


333313 


CH22_FGENES.138 5 


0.07 


325937 


CH.16_hsglJ5867132 


0.07 


335663 


CH22.FGENES.590J4 


0.07 


335349 


CH22_FGENES539_2 


0X7 


303396 AA224470 


Hs.25426 ESTs; Weakly similar to unknown [Ksapiens] 


0.07 


332603 N66681 


HS33470 ESTs 


0.07 


333310 


CH22_FGENES.138_2 


0X71 


309924 AW340812 


EST singleton (not In UniGene) with exon hit 


0X71 


336340 


CH22_FGENES.814_15 


0.071 


308025 AM53365 


Hs.172928 collagen; type 1; atonal 


0X71 


306805 AI0S5966 


EST singleton (not in UniGene) with exon hit 


0X71 


335499 


CH22_FGENES571_8 


0.071 


329669 


CR14_p2gi|6272129 


0.071 


321666 D28390 


EST cluster (not In UniGene) 


0X71 


338174 


CH22JEMAC005500.GENSCAN219-2 


0X72 
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336558 CH22_FGENES.B42_1 0.072 

305451 AA738105 Hs.140 Immunoglobulin gamma 3 (Gmmarto) 0.072 

336684 CH22_FGENES.48-1 0.072 

, 328943 CH^1_hsgIJ6Q04446 0.073 

5 333347 CH22_FGENES.303_1 a074 

333214 CK22_FGENES.104_5 0.074 

331917 AA448572 Hs.174007 ESTs; Moderately stater to 110 ALU SUBFAMILY J WARNING 0.074 

339102 Cre2_DA59H18.GENSCAN.44-9 0X174 

328122 CH.06_hsgi|5S68031 0.075 

10 332250 N62712 H&226223 KIAA0618 gene product 0.075 

328506 CHj07_hsgil5868471 0X175 

331756 AA291468 Hs58504 ESTs 0.075 

335193 CH22_FGENES507 8 0.076 

317729 AA971718 Hs.128141 ESTs 0X178 

IS 304515 AA458703 H&251577 hemoglobin; alpha 2 0076 

313644 AI565768 Hs.1 24960 ESTs 0X176 

326145 CH.17_hsgi|5867204 0.076 

336394 CH22_FGENES.823 6 0.077 

■ 306516 AA989542 EST stngieton (not In UnlGane) with axon hit 0.077 

20 300629 AA1521 19 Hs.155101 ATPsynBiase;H+transpon1ng;ii*odwnd^ 

isofoiml; cardiac muscte 0X177 

333160 CH22_FGENES.91_2 0.077 . 

337490 CH22_FGENES.79M 0X177 ~ 

305403 AA723748 EST singleton (not In UniGene) with axon hit 0X177 

25 331747 AA281765 Hs.193689 ESTs 0.077 

332792 CH22_FGENES.3_2 0X178 

330513 M81057 Hs.180884 carboxypeptidase B1 (tissue) 0.078 

308905 A1859638 Hs£102 nbosomalprotehS20 0 078 

337419 CH22_R3ENES.7594 0.078 

30 333459 CH22.FGENES.157 8 0.078 

334851 CH22_FGENE&440 3 0.078 

329046 CHJLhsgi|5B68569 0.078 

327879 CROSJis 0^5868142 OXI79 

305830 AA857665 EST singleton (not in UniGene) with exen hit 0.079 

35 302928 AL137719 EST cluster (not in UniGene) with axon hit 0.079 

304321 AA136698 Hs.1 13029 ribosomal protein S25 0.079 

326390 CH.19_hsgl|5867340 0.079 

335230 CH22_FGENES.514_2 0.08 

334622 CH22_FGENES.412_6 0.08 

40 335331 CH22_FGENES-535_4 0X8 

304753 AA578840 Hs.77961 major histocompatibility complex; class I; B 0X18 

301863 AI418863 EST cluster (not in UniGene) with exon hit 0X181 

336561 CH22_FGENES.842_6 0.081 

335811 CH22_FGENESi»3J5 0.081 

45 305060 AA635771 EST singleton (not In UniGene) with exon hit 0X181 

306051 AA905130 EST singleton (not in UniGene) with exon hit 0X182 

308289 AI571211 EST singleton (not in UniGene) with exon M 0.082 

334365 CH22_FGENES.378J3 0X182 

335496 CH22_FGENES.571_4 0.082 

50 332634 S38953 Human unidentified gene complementary to P450c2t 

gene; partial cds 0.082 

337824 CH22_EM^C005500.GENSCAN.13-18 0.082 

335822 CH22_FGENESXi19_7 - 0.082 

334758 CH22_FGENES.428_7 0082 

55 309641 AW194230 Hs253100 EST 0082 

333064 CH22_FGENES.75_7 0X183 

338695 CH22_EM^C005500.GENSCAN.477-25 0.083 

331809 AA402482 Hs57312 ESTs 0083 

326138 CH.17_hsglI5867203 0083 

60 328304 CH.07_hsgij6004478 0.083 

330570 U60276 Hs.165439 arsA (bacterial) arsenite transporter; ATP-binding; homolog 1 0.083 

334305 CH22_FGENESJ73_8 " 0083 

335885 CH22_FGENESX02_3 0083 

,_ 325839 CK16_hsgi|6552452 0.083 

65 333531 CH22_F6ENES.175_18 0.084 

330385 AA449749 Hs.31386 ESTs; Highly similar to secreted apoptosis related protein 

1 pisapiens] 0.084 

323305 AA811351 Hs25307 Homo sapiens done 24812 mRNA sequence 0X184 

331698 Z39929 Hs.65843 ESTs 0X»4 
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335888 CH22_FGENES.633_2 OJ384 

306008 AA894390 EST singleton (not In UniGene) wffii exon hS 0X184 

334249 CH22_FGENES.365_15 0.084 

318303 AW451197 Hs.113418 ESTs 0.084 

5 330171 CR02_p2gil6648220 0.084 

335662 CH22_FGENES.41-1 0.085 

320506 A1815668 Hs.157476 sud -associated neurotrophic factof target 2 

(FGFR signalling adaptor) 0.085 

316974 AI740721 Hs.128292 .ESTs 0.08S 

10 336492 CH22_FGENES.832_9 0.085 

335750 CH22_FGENES.602_4 0.085 

335676 CH22_FGENES.594_1 0.086 

336093 CH22_FGENES.691_2 0.088 

310932 AI933861 H-222852 ESTs 0.086 

15 335160 CH22_FGENES.502_4 0.086 

334306 CH22_FGENES.373_9 0.086 

334793 CH22_FGENES.433_5 0.086 

333936 CH22_FGENES.301_2 0.087 

336413 CH22_FGENES.823_35 0X87 

20 333775 CH22_FQENES272_6 0.087 

335971 CH22_FGENES.652_4 0.087 

301737 AI815981 EST cluster (not In UniGene) with axon hit 0.087 

339101 CH22_DA59H18.GENSCAN.446 O087 ~ 

327612 Ca04_hsgi^525283 0.087 

25 326241 CH.17_hsgl|5867260 0.088 

338386 CH22_EM-AC0a5500.GENSCAN33M 0j088 

327762 Ca05Jsgi]5867961 0.088 

305266 AA679772 EST singleton (not In UniGene) wrth exon hit 0.088 

334359 CH22_FGENES.378_4 0.088 

30 335500 CH22_FGENES.571_10 0.088 

329687 CH.14_p2gl|61 17856 0.088 

333654 CH22_FQENES240_2 0.088 

324430 AA464018 EST cluster (not in UniGene) 0.088 

325999 CH.16_hsgI|5867073 0.089 

35 334832 CH22_FGENES.439_1 0XJ89 

339115 CH22_DA59H18.GENSCAN.49-3 0.089 

300896 AI916902 H&213882 ESTs 0.089 

328784 CK07Jtsgi|5868309 0.089 

335044 CH22 FGENES.480J 0.089 

40 329791 CH.14_p2gil6469354 a089 

333656 CH22_FQENES240_4 0.089 

326180 CH.17_hsgi|5867211 0.089 

333391 . CH22_FGENES.144_6 0.089 

338324 CH22_EM^C005500.GENSCAN30fr3 0.089 

45 305396 AA721052 EST singleton (not In UniGene) with exon hit 0.089 

337483 CH22_FGENES.795-7 0.09 

326424 CH.19Jisgi|5867369 0.09 

306454 AA977992 EST singleton (not in UniGene) with exon hit 0.09 

338893 OE2_DJ32l10.GENSCAN.7-6 0.09 

50 327470 CHj02_hsgil5867772 0.09 

333165 CH22_FGENES.91_7 0.09 

307155 AI186738 Hs.182426 rfbosomal protein S2 0.09 

330717 AA233926 Hs.23635 ESTs - 0.09 

335334 CH22_FGENES.535_10 0.09 

55 335907 CH22_FGENES.636_2 0.09 

333885 CH22_FGENES.292J 0.09 

331034 N51868 Hs31965 ESTs; Moderately similar to 40S R1BOSOMAL 

PROTEIN S20 [H^aplens] 0.09 

304660 AA534416 Hs.162185 ESTs 0.09 

60 328217 Ca06_hsgl]5868096 0.091 

336058 CH22_FGENES.684_13 a091 

302833 AA295381 Hs.44423 ESTs 0.091 

328668 Ca07_hsgi|5868254 0.091 

335309 CH22_FGENES.532_2 0.091 

65 338481 CH22_EM:AC005500.GENSCAN.377-5 0.091 

306286 AA936892 EST singleton (not to UniGene) with exon hit 0.091 

305070 AA639783 EST singleton (not In UniGene) with exon hit OX»1 

304870 AA594811 Hs.119122 ribosomal protein L13a 0.091 

303856 AA968589 Hs.944 glucose phosphate isamerase 0X191 
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323789 A1459812 H&170460 ESTs; Weakly similar to K1AAQ990 protein [Haptens] 0X92 

334910 CH22_FGENES.455_3 0X92 

326382 CH.19jBgip867327 0.092 

332467 AA48S630 Hs.1 19004 K1AA0665 gene product 0.092 

5 338534 CH22_EM:AC005500.GENSCAN.402-7 0X92 

336449 CH22_FGENES.829_e 0X92 

333709 CH22_FGENES.250_24 0092 

336559 CH22_FOENESX42_4 0X92 

333230 CH22_FGENES.107_10 0.093 

10 333133 CH22_.FGENES.83_9 0.093 

334885 CH22_FGENES.451_11 0X93 

330605 X02419 Hs.77274 plasminogen activator; urokinase 0X93 

336392 CH22_FGENESX23_4 0.093 

334083 CH22_FGENESX27J8 0X93 

15 325469 CH.12JB #017034 0.093 

331077 R09531 Hs.19039 ESTs 0.093 

303701 AW500732 EST dustar (not In UniGene) with exon hit 0X93 

334218 CH22_FGENESX58_3 0.093 

336542 CH22_FGBIESX40_6 0X93 

20 337151 CH22_FGENES546-1 0X93 

333642 CH22_FGENES.231_2 0X93 

336863 CH22_FGENES.297-4 0.093 . 

334680 CH22JHJENES.419J2 0.093 ~ 

326365 CH.18Jisglp887287 0X93 

25 338952 CH22JXI32I10.GENSCAN53-22 0X93 

337539 CH22_FGENESX32-4 0X94 

333546 CH22J=GENES.180J 0X94 

335258 CH22_FGEMES£18_3 0X94 

336786 CH22_FGENES.168-19 0.094 

30 321644 AI204177 Hs.237396 ESTs 0.094 

335943 CH22_FGENES.646_17 0.094 

327918 CHj06_hsgI|5868165 0X94 

306398 AA970548 EST singleton (not in UniGene) with exon hit 0X94 

335671 CH22_FGENES592_3 0X94 

35 335033 CH22_FGENES475_11 0X94 

338277 CH22_EM^C005500.GENSCAN290-2 0X94 

332061 AA504812 Hs.1 92824 early B-ceD factor 0.094 

305153 AA654582 Hs.77039 ribosomal protaln S3A 0.094 

333880 CH22_FGENES.282_2 0X94 

40 323940 AI864428 Hs.170880 ESTs 0X94 

313779 AA648796 Hs.129771 ESTs 0X95 

323109 AA169345 EST duster (not In UniGene) 0.095 

332930 CH22_FGENES.38_4 0X95 

335368 CH22_FGENES543_6 0.095 

45 303887 R72672 Hs.193484 ESTs; Weakly similar to Similarity w'rth yeast gene 

13502.1 [Celegans] 0.095 

336223 CH22JFGENES.727J3 0X95 

311280 AI767957 Hs.197737 ESTs; Wealdy similar to Y38A8.1 gane product [C.elegans] 0.095 

337256 CH22.FGENES.64fW 0.095 

50 308814 AI819263 EST singleton (not In UniGene) with oxon hit 0.095 

334659 CK22_FGENES.418_7 0X95 

335895 CH22J=GEMESX35_3 0.095 

321697 AW388061 Hs.4953 golgi autoantigen; golgin subfamily a; 3 0.095 

336010 CH22J=GENES.668_8 0.096 

55 302824 U21260 EST cluster (not In UniGene) with exon hit 0.096 

333612 CH22_FGENES517_7 0.096 

304823 AA584837 EST singleton (not In UniGene) with exon hit 0.096 

335665 CH22_FGENES.590_16 0.096 

306518 AA989598 EST singleton (not In UniGene) with exon hit 0.096 

60 335243 CH22J=GENES.516_4 0.096 

335436 CH22_FGENES.559_5 0.096 

300243 AI420256 Hs.1 61271 ESTs 0.096 

332810 CH22_FGENES.7_12 0.097 

308612 AT735634 EST slngbton (not tn UniGene) with exon hit 0.097 

65 335818 CH22 FGENES.618_6 0.097 

325838 CH.16_hsgi|5552452 0X97 

337482 CH22_FGENES.795-6 0X97 

336645 CH22_FGENES.26-1 0X97 

337293 CH22_FGENESX75-1 0X98 

329893 


CH.15_p2 gip>525313 






CH.19_hs gip867441 


n ooo 


334905 CH22_FGENES.452_20 0.098

306347 AA9S1144 EST singleton (not in UniGene) with exon hit 0.098

336676 CH22_FGENES.43-4 0.098

339166 CH22_DA59H18.GENSCAN.69-7 0.098

335774 CH22_FGENES.607_10 0.098

339216 CH22_FF113D11.GENSCAN.6-11 0.098

335311 CH22_FGENES.532_4 0.098

329832 CH.11_p2gi|6729060 0.098

328595 CH.07_hsgi|5868224 0.098


326928 CH.21_hsgi|6456782 0.098

315234 AKJ79680 Hs.120770 ESTs 0.098

306082 AA908508 EST singleton (not in UniGene) with exon hit 0.098

305710 AA826544 EST singleton (not in UniGene) with exon hit 0.098

318540 T30280 EST cluster (not in UniGene) 0.099

337553 CH22_C4G1.GENSCAN.2-1 0.099

320951 AA344069 Hs.202699 neutexophin 4 0.099

303845 T08033 EST cluster (not in UniGene) with exon hit 0.099

338981 CH22_DA59H18.GENSCAN.2-5 0.099

321313 R87365 Hs.26058 ESTs; Weakly similar to pS3Z [H.sapiens] 0.099


328348 CH.07_hsgi|5868383 0.099

332203 H49388 Hs.102082 EST 0.099

301780 R07064 EST cluster (not in UniGene) with exon hit 0.099

332095 AA608838 Hs.162681 EST 0.099

333227 CH22_FGENES.107_5 0.099

316442 AA760894 Hs.153023 ESTs 0.099

326001 CH.16_hsgi|5867073 0.099

334363 CH22_FGENES.378_11 0.099

338895 CH22_DJ32I10.GENSCAN.5-2 0.099

327460 CH.02_hsgi|6004455 0.099


332705 T59161 Hs.76293 thymosin; beta 10 0.1

307806 AI351739 EST singleton (not in UniGene) with exon hit 0.1

322800 F25037 Hs__5175 ESTs 0.1

304918 AA602697 EST singleton (not in UniGene) with exon hit 0.1

334327 CH22_FGENES.375_4 0.1

318359 AI097439 Hs.135548 ESTs 0.1

326644 CH.20_hsgi|5867559 0.1

334454 CH22_FGENES.388_3 0.1

327959 CH.06_hsgi|5868210 0.1

323783 AA330586 Hs.131819 ESTs 0.1


309198 AB55915 Hs.248038 major histocompatibility complex; class I; C 0.1

339265 CH22_BA354I12.GENSCAN.10-3 0.1

320576 AUJ49977 Hs.162209 Homo sapiens mRNA; cDNA DKFZp564C122

(from clone DKFZp564C122) 0.1

338132 CH22_EM:AC005500.GENSCAN.200_ 0.1

333163 CH22_FGENES.91_5 0.101

337584 CH22_C20H12.GENSCAN.5-1 0.101

307588 AI285535 EST singleton (not in UniGene) with exon hit 0.101

336969 CH22_FGENES.378-2 0.101

327535 CH.02_hsgi|6525279 0.101


328732 CH.07_hsgi|5868289 0.101

336686 CH22_FGENES.46-3 0.101

335777 CH22_FGENES.607_ l o 0.101

332944 CH22_FGENES.47_3 0.101

333174 0.101

336380 CH22_FGENES_21_8 0.101

330571 U60800 Hs.79089 sema domain; immunoglobulin domain (Ig);

cytoplasmic domain; (semaphorin) 4D 0.101

331789 AA398721 Hs.186749 ESTs 0.101


338915 CH22_DJ32I10.GENSCAN.12-1 0.101

334844 CH22_FGENES.439_24 0.101

336642 CH22_FGENES.23-4 0.101

334906 CH22_FGENES.452_21 0.101

333188 CH22_FGENES.98_8 0.101

300088 AW299993 EST cluster (not in UniGene) with exon hit 0.101

329373 CH.X_hsgi|6682537 0.102

331120 R46576 Hs.23239 ESTs 0.102

335856 CH22_FGENESX28_1 0.102

331888 AA431337 Hs.98017 ESTs 0.102

333154 CH22_FGENES.89_4 0.102 

335989 CH22_FGENES.655_2 0.102 

304385 AA235602 EST singleton (not in UniGene) with exon hit 0.102

5 338016 CH22_EM:AC005500.GENSCAN.133-1 0.102

335190 CH22_FGENES.507_5 0.102 

318595 T39486 Hs£137 ESTs 0.102 

333697 CH22_FGENES.250_11 0.102 

306526 AAS89713 EST singleton (not in UniGene) with exon hit 0.103 

10 328734 CH.07_hsgi|5868289 0.103

307294 AI205612 Hs.73742 ribosomal protein; large; P0 0.103

327424 CH.02_hsgi|5867751 0.103

335872 CH22_FGENES.630_3 0.103

333572 CH22_FGENES.189_1 0.103

15 334774 CH22_FGENES.430_6 0.103

338660 CH22_EM:AC005500.GENSCAN.462-1 0.103

326713 CH.20_hsgi|5867595 0.103

333994 CH22_FGENES.310_18 0.103

335800 CH22_FGENES.613_4 0.103 

20 318113 AI187943 Hs.132322 ESTs 0.103 

337278 CH22_FGENES.665-1 0.103

336386 CH22_FGENES.822_6 0.103

334790 CH22_FGENES.432_15 0.103

303778 AW505368 EST cluster (not in UniGene) with exon hit 0.104

25 336524 CH22_FGENES.839_5 0.104

328936 CH.08_hsgi|5888500 0.104

335102 CH22_FGENES.494_7 0.104

300935 AA513644 Hs.222815 ESTs; Weakly similar to Wiskott-Aldrich Syndrome

protein [H.sapiens] 0.104

30 307581 AI284415 EST singleton (not in UniGene) with exon hit 0.104

317301 AW291683 Hs.226056 ESTs 0.104 

335330 CH22_FGENES.535_3 0.104 

337968 CH22_EM:AC005500.GENSCAN.103-2 0.104

335627 CH22_FGENES.584_7 0.104

35 336274 CH22_FGENES.762_2 0.104

334730 CH22_FGENES.424_5 0.105

334409 CH22_FGENES.383_6 0.105

327237 CH.01_hsgi|5867544 0.105

333321 CH22_FGENES.138_13 0.105

40 303181 AA452366 EST cluster (not in UniGene) with exon hit 0.105

333738 CH22_FGENES.261_2 0.105

338255 CH22_EM:AC005500.GENSCAN.276-3 0.105

334282 CH22_FGENES.369_12 0.105

330190 CH.05_p2gi|6165182 0.105

45 310748 AW014249 Hs.158698 ESTs 0.105

338150 CH22_EM:AC005500.GENSCAN507-2 0.105

336719 CH22_FGENES.82-6 0.105

330228 CH.05_p2gi|6013527 0.105

327801 CH.05_hsgi|5857924 0.105

50 330525 S75168 Hs.274 megakaryocyte-associated tyrosine kinase 0.105

334972 CH22_FGENES.468_2 0.105 

335111 CH22_FGENES.494J9 0.106

334483 CH22_FGENES.395_5 0.106

328829 CH.07_hsgi|5868337 0.106

55 302753 M74299 EST cluster (not in UniGene) with exon hit 0.106

334512 CH22_FGENES.398_10 0.106

330024 CH.16_p2gi|6671908 0.106

321030 AI769930 Hs.233617 Homo sapiens (clone B3B3E13) Huntington's

disease candidate region 0.107

60 338410 CH22_EM:AC005500.GENSCAN.341-6 0.107

334353 CH22_FGENES.376_5 0.107

338276 CH22_EM:AC005500.GENSCAN.288-9 0.107

328053 CRX_hsgi|5868574 0.107

336560 CH22_FGENES.842_5 0.107

65 332158 AA621363 Hs.112980 EST 0.107

336447 CH22_FGENES.829_4 0.107

333703 CH22_FGENES.250_17 0.107

326207 CH.17_hsgi|5867222 0.107

333232 CH22_FGENES.108_1 0.107

334802 CH22_FGENES.435_1 0.107

303784 AA704983 EST cluster (not in UniGene) with exon hit 0.107

338847 CH22_DJ246D7.GENSCAN.10-2 0.107

339407 CH22_DJ579N16.GENSCAN.1-9 0.108

5 337635 CH22_C20H12.GENSCAN.32-8 0.108

334650 CH22_FGENES.417_17 0.108

308511 AI687580 EST singleton (not in UniGene) with exon hit 0.108

333332 CH22_FGENES.144_8 0.108 

325840 CH.16_hsgi|6552452 0.108

10 315044 AW205664 Hs.129568 ESTs 0.108

333298 CH22_FGENES.133_4 0.108

335157 CH22_FGENES.501_7 0.108

333305 CH22_FGENES.137_2 0.108

326379 CH.19_hsgi|5867327 0.108

15 335050 CH22_FGENES.432_1 0.108

305185 AA663985 Hs.248038 major histocompatibility complex; class I; C 0.108

335658 CH22_FGENES.590J 0.108

323040 AA336609 Hs.10862 ESTs 0.108 

337326 CH22_FGENES.699-6 0.108 

20 333262 CH22_BA354l12.GENSCAN.9-6 0.108 

321202 H54052 Hs.163639 ESTs; Weakly similar to INTERCELLULAR ADHESION

MOLECULE-1 PRECURSOR [H.sapiens] 0.109

331792 AA398968 Hs.97548 EST 0.109

333806 CH22_FGENES.278_2 0.109

25 321325 AB033100 EST cluster (not in UniGene) 0.109

331373 AA435513 Hs.178170 ESTs; Weakly similar to DUAL SPECIFICITY

PROTEIN PHOSPHATASE 3 0.87

328775 CH.07_hsgi|5868309 0.109

335105 CH22_FGENES.494_10 0.109

30 300975 AI283548 Hs.149868 ESTs 0.109

324893 T31940 EST cluster (not in UniGene) 0.109

333397 CH22_FGENES.144_15 0.109

336434 CH22_FGENES.831_3 0.109

335507 CH22_FGENES.571_22 0.109

35 336373 CH22_FGENES.820_3 0.109

336188 CH22_FGENES.717_12 0.109

313455 AW0B17O2 Hs.137329 ESTs 0.109

335185 CH22_FGENES.508_4 0.109

306814 AI066577 EST singleton (not in UniGene) with exon hit 0.109

40 311130 AI632322 Hs.195306 ESTs 0.109 

310882 AW080339 Hs.211911 ESTs 0.109 

323383 AI346359 Hs.135209 ESTs 0.11 

300212 AW135925 Hs.184552 biphenylhydrolase-like (serine hydrolase; breast epithelial

mucin-assoc. 0.11

45 325675 CH.14_hsgi|5867014 0.11

330095 CH.19_p2gi|6015278 0.11

331942 AA453261 Hs.99309 ESTs 0.11

334723 CH22_FGENES.421_34 0.11

333614 CH22_FGENES.217_9 0.11

50 337316 CH22_FGENES.692-1 0.11

305057 AA635628 Hs.62954 ferritin; heavy polypeptide 1 0.11

338704 CH22_EM:AC005500.GENSCAN.480-3 0.11

335385 CH22_FGENES.543_27 0.11

338012 CH22_EM:AC005500.GENSCAN.128-10 0.11

55 329449 CH.Y_hsgi|58688S8 0.11

338980 CH22_DA59H18.GENSCAN.2-4 0.11

336553 CH22_FGENES.841_10 0.111

330021 CH.18_p2gi|6671889 0.111

327579 CH.03_hsgi|5867824 0.111

60 333099 CH22_FGENES.79_4 0.111

337076 CH22_FGENES.453-4 0.111

331388 AA456852 Hs.43543 suppressor of white apricot homolog 2 0.111

306674 AI005542 Hs.180414 heat shock 70kD protein 10 (HSC71) 0.111

305949 AA884409 EST singleton (not in UniGene) with exon hit 0.111

65 330748 AA419217 Hs.15911 DKFZP586E1422 protein 0.111

333780 CH22_FGENES.273_2 0.111

323676 AI702835 EST cluster (not in UniGene) 0.111

308952 AI8681S7 Hs.224226 EST 0.111

309338 AW026946 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 0.111

329317 CH.X_hsgi|6381976 0.112

333518 CH22_FGENES.173_3 0.112

306382 AI127883 EST singleton (not in UniGene) with exon hit 0.112

336225 CH22_FGENES.728_2 0.112

5 333698 CH22_FGENES.256_12 0.112

302173 AMI 7947 Hs.14068 ESTs 0.112

335510 CH22_FGENES.571J5 0.112

328042 CH.06_hsgi|5902432 0.112

338512 CH22_FGENES334_7 0.112

10 328541 CH.07_hsgi|5888486 0.112

311265 AW205118 Hs.199214 ESTs 0.112

323218 AF131846 Hs.13396 Homo sapiens clone 25028 mRNA sequence 0.112

302002 AF013956 Hs.123085 chromobox homolog 4 (Drosophila Pc class) 0.112

315088 AA557351 Hs.152449 ESTs; Moderately similar to MULTIFUNCTIONAL PROTEIN ADE2 0.112 

15 312581 AI937242 Hs.176590 ESTs 0.112 

322246 AW384710 Hs.125258 ESTs 0.112 

333659 CH22_FGENES.241J 0.113 

327510 CH.02_hsgi|6117815 0.113

336520 CH22_FGENES.839_1 0.113

20 338682 CH22_EM:AC005500.GENSCAN.472-1 0.113

334508 CH22_FGENES.398_6 0.113

322533 T59538 EST cluster (not in UniGene) 0.113

306873 AKJ86929 EST singleton (not in UniGene) with exon hit 0.113

336040 CH22_FGENES.679J 0.113

25 303898 T23215 EST cluster (not in UniGene) with exon hit 0.113

312011 AW294868 Hs.187226 ESTs 0.113 

335186 CH22_FGENES.508_5 0.113 

333607 CH22_FGENES.216_2 0.113 

305549 AA773530 EST singleton (not In UniQene) with exon hit 0.113 

30 333686 CH22_FGENES.249_4 0.113 

334352 CH22_FGENES.376_3 0.113 

338195 CH22_EM:AC005500.GENSCAN533-18 0.114

333588 CH22_FGENES.206_2 0.114

339233 CH22_BA354I12.GENSCAN.2-3 0.114

35 337455 CH22_FGENES.777-1 0.114

309101 AI925108 EST singleton (not in UniGene) with exon hit 0.114

328522 CH.07_hsgi|5868477 0.114

323999 AI537333 Hs.252782 ESTs 0.114

333517 CH22_FGENES.173_2 0.114

40 329935 CH.16_p2gi|6165200 0.114

326226 CH.17_hsgi|5867230 0.114

335890 CH22_FGENES.633_4 0.114

336715 CH22_FGENES.77-1 0.114

327640 CH.04_hsgi|5867890 0.114

45 338842 CH22_DJ246D7.GENSCAN.7-1 0.114

306534 AA991487 EST singleton (not in UniGene) with exon hit 0.114

338597 CH22_FGENES.266_1 0.114

321010 Y17456 Hsi27150 Homo sapiens LSFR2 gene; last exon 0.114

302294 AA159213 Hs.5337 isocitrate dehydrogenase 2 (NADP+); mitochondrial 0.114

50 324895 N44238 Hs.77515 inositol 1,4,5-triphosphate receptor, type 3 0.114

327358 CH.01_hsgi|6552411 0.114

303792 AI815153 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 0.115

325886 CH.16_hsgi|5867087 0.115

338850 CH22_FGENES.272-11 0.115

55 305858 AAB63103 EST singleton (not in UniGene) with exon hit 0.115

302569 AC004472 multiple UniGene matches 0.115

336158 CH22_FGENES.707_2 0.115

327866 CH.06_hsgi|5868131 0.115

339157 CH22_DA59H18.GENSCAN.67-3 0.115

60 339258 CH22_BA354I12.GENSCAN.8-3 0.115

336129 CH22_FGENES.701J7 0.115 

333684 CH22_FGENES.249_2 0.115 

309618 AW190162 Hs.184776 ribosomal protein L23a 0.115 

312926 AA954097 Hs.127523 ESTs 0.115 

65 302640 AB035698 EST cluster (not in UniGene) with exon hit 0.115

328968 CH.08_hsgi|6456775 0.115

327902 CH.06_hsgi|5868158 0.115

321927 AJ223366 EST cluster (not in UniGene) 0.115

335982 CH22_FGENES.651_4 0.115 

334927 CH22_FGENES.460_1 0.115

330535 U11872 Human interleukin 8 receptor type B (IL8RB) mRNA,

splice variant IL8RB1 0858

328591 CH.07_hsgi|5868227 0.115

5 334902 CH22_FGENES.452_16 0.115

328525 915888482 0.115

325870 CH.16_hsgi|6682492 0.116

337522 CH22_FGENES.819-1 0.116

305079 AA641329 EST singleton (not in UniGene) with exon hit 0.116

10 327343 CH.01_hsgi|6017017 0.116

333918 CH22_FGENES.296_7 0.116 

333600 CH22_FGENES.213_2 0.118

335846 CH22_FGENES.623_6 0.116

333510 CH22_FGENES.171_4 0.116

15 327629 CH.04_hsgi|5867872 0.116

333470 CH22_FGENES.161_6 0.116

326855 CH.20_hsgi|6552460 0.116

327008 CH.21_hsgi|5887664 0.117

337480 CH22_FGENES.79M 0.117

20 338425 CH22_FGENES.824_10 0.117

321964 AU079687 Hs.171065 ESTs 0.117

335651 CH22_FGENES.590_2 0.117

308164 AI521574 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 0.117

337927 CH22_EM:AC005500.GENSCAN£0-3 0.117

25 300341 H45095 Hs.153524 ESTs 0.117 

300154 AE45127 Hs.179331 ESTs 0.117 

306295 AA937331 EST singleton (not in UniGene) with exon hit 0.117

329670 CH.14_p2gi|6272129 0.117

335612 CH22_FGENES.583_6 0.117

30 307845 AI363450 EST singleton (not in UniGene) with exon hit 0.117

330401 D28383 Human mRNA for ATP synthase B chain, 5'UTR (sequence from the

5'cap to the start codon) 0.117

327127 CH.21_hsgi|6682520 0.117

333843 CH22_FGENES.290J 0.117

35 331083 R17762 Hs.22292 ESTs 0.117

329140 CH.X_hsgi|6017060 0.117

339338 CH22_BA354I12.GENSCAN.27-3 0.117

331974 AA464518 Hs.99616 ESTs 0.117

338631 CH22_EM:AC005500.GENSCAN.454-2 0.117

40 330299 CH.06_p2gi|2905881 0.117

330351 CH.09_p2gi|3056622 0.117

305377 AA715714 Hs.181357 laminin receptor 1 (67kD; ribosomal protein SA) 0.117

333106 CH22_FGENES.79_12 0.117

338514 CH22_EM:AC005500.GENSCAN.392-4 0.117

45 327335 CH.01_hsgi|5902477 0.117

301970 AB028962 Hs.120245 KIAA1039 protein 0.118

326339 CH.17_hsgi|6056311 0.118

330612 X15673 Hs53174 Human endogenous retrovirus pHE.1 (ERV9) 0.118 

334178 CH22_FGENES.350_6 0.118 

50 328008 CH.06_hsgi|5902482 0.118

329976 CH.16_p2gi|4878063 0.118

320952 AA897432 Hs.130411 ESTs 0.118

305621 AA789095 EST singleton (not in UniGene) with exon hit 0.118

337850 CH22_EM:AC005500.GENSCAN.34-3 0.118

55 333626 CH22_FGENES.224_2 0.118

337672 CH22_EM:AC000097.GENSCAN.67-1 0.118

328803 CH.07_hsgi|6004475 0.118

325922 CH.16_hsgi|5867122 0.118

334489 CH22_FGENES.397_1 0.118

60 320638 R54766 Hs.101120 ESTs 0.118

321932 AA569229 EST cluster (not in UniGene) 0.118

336958 CH22_FGENES.357-1 0.118

332082 AA600176 Hs.112345 ESTs 0.118 

306004 AA889992 EST singleton (not in UniGene) with exon hit 0.118 

65 338803 CH22_FGENES.194-1 0.118

309107 AI925823 EST singleton (not in UniGene) with exon hit 0.118

336859 CH22_FGENES593-9 0.118

337935 CH22_EM:AC005500.GENSCAN.8M 0.118

326492 CH.19_hsgi|5887422 0.118

327289 CH.01_hsgi|5867481 0.119

325818 CH.14_hsgi|6682490 0.119

310787 AW262580 Hs.159040 ESTs 0.119

330028 CH.16_p2gi|6871908 0.119

5 325317 CH.11_hsgi|5866878 0.119

335279 CH22_FGENES£23J 0.119

331720 AA192173 Hs.221530 ESTs 0.119

329188 CH.X_hsgi|5868711 0.119

316012 AA764950 Hs.119898 ESTs 0.119

10 338316 CH22_EM:AC005500.GENSCAN.304-2 0.119

326033 CH.17_hsgi|5867178 0.119

334745 CH22_FGENES.428_3 0.119

333051 CH22_FGENES.73_5 0.119

301783 R01279 EST cluster (not in UniGene) with exon hit 0.12

15 304502 AA454809 Hs.172928 collagen; type I; alpha 1 0.12

335680 CH22_FGENES.594_5 0.12

304678 AA548556 EST singleton (not in UniGene) with exon hit 0.12

335441 CH22_FGENES.560_4 0.12

336187 CH22_FGENES.717_11 0.12

20 309422 AW087175 EST singleton (not in UniGene) with exon hit 0.12

336047 CH22_FGENES.679_9 0.12

309851 AW195850 EST singleton (not in UniGene) with exon hit 0.12

308547 AI695385 Hs.201903 EST 0.12

304443 AA399444 EST singleton (not in UniGene) with exon hit 0.12 

25 336245 CH22_FGENES.746_3 0.12 

302703 H72333 EST cluster (not In UniGene) with exon hit 0.12 

335690 CH22_FGENES.536_5 0.12

328941 CH.08_hsgi|6456765 0.12

333873 CH22_FGENES.291_9 0.12

30 317246 AW105092 Hs.155690 ESTs 0.12

339288 CH22_BA354I12.GENSCAN.16-6 0.12

337996 CH22_EM:AC005500.GENSCAN.116-3 0.12

333304 CH22_FGENES.137_1 0.121

308332 AI591235 EST singleton (not in UniGene) with exon hit 0.121

35 329319 CH.X_hsgi|6331976 0.121

302086 X57138 multiple UniGene matches 0.121

333290 CH22_FGENES.129_2 0.121

323825 AI793080 Hs.123525 ESTs; Weakly similar to NEUTROPHIL GELATINASE-ASSOCIATED

LIPOCALIN PRECURSOR [R.norvegicus] 0.121

40 330575 U64105 Hs.252280 Rho guanine nucleotide exchange factor (GEF) 1 0.121

305274 AA679990 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 0.121

333647 CH22_FGENES.235_2 0.121

302251 AA333340 EST cluster (not in UniGene) with exon hit 0.121

329777 CH.14_p2gi|6002090 0.121

45 333155 CH22_FGENES.89_5 0.121

326122 CH.17_hsgi|5867194 0.121

335310 CH22_FGENES.532_3 0.121

335453 CH22_FGENES.562_13 0.122

305103 AA643329 Hs.111334 ferritin; light polypeptide 0.122

50 337284 CH22_FGENES.667-2 0.122

337418 CH22_FGENES.758-4 0.122

313073 AI963740 Hs.46826 ESTs 0.122 

303759 AW504164 EST cluster (not in UniGene) with exon hit 0.122

55 300017 M33197 AFFX control: GAPDH 0.122

316725 AW135084 Hs.127264 ESTs 0.122

330738 AA293153 Hs.120980 nuclear receptor co-repressor 2 0.122

338466 CH22_FGENES.829J5 0.122

335956 CH22_FGENES.647_3 0.122

60 315308 AA780564 Hs.189053 ESTs 0.122

338925 CH22_DJ32I10.GENSCAN.14-3 0.122

334969 CH22_FGENES.466_2 0.122

322050 AL137589 EST cluster (not in UniGene) 0.122

339084 CH22_DA59H18.GENSCAN.38-2 0.122

65 338323 CH22_EM:AC005500.GENSCAN.306-2 0.122

337003 CH22_FGENES.419-7 0.122

325470 CH.12_hsflP17034 0.123

336503 CH22_FGENES.833_10 0.123

330786 060374 Hs_»58712 EST 0.123 

329446
303326 

309067 
317464 
328755 
326038 
327208 
326124 
327509 
338398 
304652 

335797 
336714 
327204 
331681 
306971 
336174 
336126 
329129 
303049 
335778 
336601 
334340 



AA229433 Hs.222634

AI916313 Hs.212788
AA968472 Hs.130463 



AA527782 Hs.84298



AA430672 Hs.123778 
AI126509



306013 
339213 
335355 
336552 
336384 
310485 
335840 
336444 
315703 
327763 
336383 



AW407562 



AA8969S0 



AK86202 Hs.149800
N36070 



328662 
338988 
328311 
337241 



313483 
326116 
330450 
307491 
331852 

330462 
304410 



336793 
326243 
327266 
320753 
336960 
329667 
328168 
336534 



309230 
339190 
337086 
319233 
339396 
331930 



AW294432 Hs.144252 

HG363-HT363 
AI268539 

AM18988 Hs.98314

HG944-HT944
AA284508 



AP070579 Hs.181544 



AI970747 

R21054 Hs.211522 
AA449077 Hs.179765 



308099 AI475914 



CH.Y_hsgi|5868886 0.123
ESTs; Moderately similar to ubiquitin-like protein /

ribosomal protein S30 0.123

EST 0.123

ESTs 0.123

CH.07_hsgi|5868301 0.123

CH.17_hsgi|5867178 0.123

CH.01_hsgi|5867447 0.123

CH.17_hsgi|5916395 0.123

CH.02_hsgi|6117815 0.123

CH22_EM:AC005500.GENSCAN.336-5 0.123
CD74 antigen (invariant polypeptide of major

histocompatibility complex; class II antigen-associated) 0.123

CH22_FGENES.612_6 0.124

CH22_FGENES.76-29 0.124

CH.01_hsgi|5867447 0.124

ESTs 0.124

EST singleton (not in UniGene) with exon hit 0.124

CH22_FGENES.710_1 0.124

CH22_FGENES.701_13 0.124

CHX_hsgq65B8026 0.124

EST cluster (not in UniGene) with exon hit 0.124

CH22_FGENES.607_14 0.124

CH22_FGENES569_2 0.124

CH22_FGENES.375J7 0.124

CH22_FGENES.767-1 0.124

EST singleton (not in UniGene) with exon hit 0.124

CH22_FF113D11.GENSCAN.6fl 0.124

CH22_FGENES.541J 0.124

CH22_FGENES.841J 0.124

CH22_FGENES.822_4 0.124 

ESTs 0.125 

CH22_FGENES£22J3 0.125 

CH22_FGENES.827J0 0.125 

EST cluster (not In UniGene) 0.125 

CH.05_hsgi|5867961 0.125

CH22_FGENES.822_3 0.125

CH22_FGENES.168J 0.125

CH.07_hsgi|6004473 0.125

CH22_DA59H18.GENSCAN.5-1 0.125

CH.07_hsgi|5866371 0.125

CH22_FGENES£44-2 0.125 

CH22_FGENES.350-7 0.125 

ESTs 0.125 

CH.17_hsgi|5867193 0.125

Epidermal Growth Factor Receptor-Related Protein 0.125

EST singleton (not in UniGene) with exon hit 0.125
Homo sapiens mRNA; cDNA DKFZp586L0120

(from clone DKFZp586L0120) 0.125

Dopamine Receptor D4 0.125

EST singleton (not in UniGene) with exon hit 0.125

CH22_FGENES.822_5 0.125

CH22_FGENES.176-3 0.125

CH.17_hsgi|5867261 0.125

CH.01_hsgi|5867462 0.125

Homo sapiens clone 24487 mRNA sequence 0.125

CH22_FGENES.369-5 0.125

CH.14_p2gi|6272129 0.125

CH.06_hsgi|5868071 0.125

CH22_FGENES.839_16 0.125

CH22_BA354I12.GENSCAN.1M 0.126 

EST singleton (not in UniGene) with exon hit 0.126 

CH22_FF113D11.GENSCAN.1-2 0.126 

CH22_FGENES.45B-14 0.126 

ESTs 0.126 

CH22_BA232E17.GENSCAN.64 0.126 
Homo sapiens mRNA; cDNA DKFZp586H1921

(from clone DKFZp586H192 0.126

EST singleton (not In UniGene) with exon hit 0.126 




338477 CH22_EM:AC005500.GENSCAN37*5 0.126

334286 CH22_FGENES.369_16 0.126

317245 AI025039 Hs.131732 ESTs 0.126

335249 CH22_FGENES.516_10 0.126

5 333327 CH22_FGENES.133_20 0.126

304240 AA00S802 EST singleton (not in UniGene) with exon hit 0.126

335464 CH22_FGENES.562_26 0.126

335236 CH22_FGENES.515_8 0.126

334154 CH22_FGENES.340_4 0.126

10 309257 AI934183 EST singleton (not in UniGene) with exon hit 0.126

310015 AI220122 Hs.201981 ESTs; Weakly similar to breast carcinoma-associated antigen

[H.sapiens] 0.126

328280 CH.07_hsgi|5868352 0.126

305744 AA831819 EST singleton (not in UniGene) with exon hit 0.126

15 327430 CH.02_hsgi|5887754 0.126

328323 CH.07_hsgi|5868373 0.126

333274 CH22_FGENES.123_2 0.126 

337193 CH22_FGENES.575-3 0.127 

334820 CH22_FGENES.437_2 0.127 

20 328706 CH.07_hsgi|5668270 0.127

331228 W67267 Hs.174911 ESTs 0.127

307205 AI192479 EST singleton (not in UniGene) with exon hit 0.127

337123 CH22_FGENES.519-3 0.127

326201 CH.17_hsgi|5867216 0.127

25 335276 CH22_FGENES.523_2 0.127

331202 T81115 Hs.191136 ESTs 0.127

330532 U03187 Hs.121544 interleukin 12 receptor; beta 1 0.127

321235 N49521 EST cluster (not in UniGene) 0.127

301743 F12605 Hs.204529 ESTs; Weakly similar to reverse transcriptase [H.sapiens] 0.127

30 328175 CH.06_hsgi|5668073 0.127

306407 AA971985 EST singleton (not in UniGene) with exon hit 0.127

327145 CH.01_hsgi|5867548 0.127

327649 CH.04_hsgi|5867899 0.127

335142 CH22_FGENES.498_12 0.127

35 333909 CH22_FGENES.295_2 0.127

330608 X04325 Hs.2679 gap junction protein; beta 1; 32kD (connexin 32;

Charcot-Marie-Tooth neuropathy; X-linked) 0.127

330158 CH.21_p2gi|6580367 0.127

320153 AF064594 Hs.120360 phospholipase A2; group VI 0.127

40 314407 AA098835 Hs.224432 ESTs 0.127

333383 CH22_FGENES.143_22 0.127 

320663 AI734242 Hs£44473 ESTs 0.128 

326233 CH.17_hsgi|5867232 0.128 

326598 CH.20_hsgi|5867634 0.128

45 335174 CH22_FGENES.504_4 0.128

319843 H29920 Hs.99486 ESTs; Weakly similar to aralar1 [H.sapiens] 0.128

335458 CH22_FGENES.562_18 0.128

332997 CH22_FGENES.58_4 0.128

334188 CH22_FGENES.352_3 0.128

50 329759 CH.14_p2gi|6048280 0.128

330348 CH.09_p2gi|4544475 0.128

326958 CH.21_hsgi|6469836 0.128

305263 AA679467 EST singleton (not in UniGene) with exon hit 0.128

337693 CH22_EM:AC000097.GENSCAN.78-14 0.128

55 326812 CH.20_hsgi|6682504 0.128

333237 CH22_FGENES.108_7 0.128

333699 CH22_FGENES.250_13 0.128

311496 AI768677 Hs.209888 ESTs; Weakly similar to phosphatidylserine

synthase-2 [M.musculus] 0.128

60 336499 CH22_FGENES.833_4 0.128

320087 AF032387 Hs.113265 small nuclear RNA activating complex; polypeptide 4; 190kD 0.128

309989 AI184186 Hs.197813 ESTs 0.128

301490 AW298468 Hs.250461 ESTs 0.128

337011 CH22_FGENESX27-6 0.128

65 315052 AA876910 Hs.134427 ESTs 0.128

301611 W22172 HsJ9038 ESTs 0.128

336497 CH22_FGENES.833_2 0.129

302068 Y16280 Hs.132049 endothelin type b receptor-like protein 2 0.129

334502 CH22_FGENES.397_18 0.129

304332 AA158884 EST singleton (not in UniGene) with exon hit 0.129

304522 AA46540S EST singleton (not in UniGene) with exon hit 0.129

312407 R46180 Hs.153485 ESTs 0.129

310093 AI685841 Hs.161354 ESTs 0.129

5 301119 AF142579 EST cluster (not in UniGene) with exon hit 0.129

309268 AIS85621 Hs.62954 ferritin; heavy polypeptide 1 0.129

330989 H42142 Hs.226396 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 19

(Dbp5; yeast homolog) 0.129

336949 CH22_FGENES^81-4 0.129

10 330115 CH.19_p2gi|6015202 0.129

339212 CH22_FF113D11.GENSCAN.6-7 0.129

326951 CH.21_hsgi|6004448 0.129

305165 AAS62939 EST singleton (not in UniGene) with exon hit 0.129

308238 AI559492 EST singleton (not in UniGene) with exon hit 0.129

15 337140 CH22_FGENES.537-5 0.13

321758 U29112 EST cluster (not in UniGene) 0.13

304619 AA515554 Hs.119598 ribosomal protein L3 0.13

312469 AA7452B9 Hs.173088 ESTs 0.13

339017 CH22_DA59H18.GENSCAN2(W 0.13

20 330116 CH.19_p2gi|6015202 0.13

333312 CH22_FGENES.138_4 0.13

338004 CH22_EM:AC005500.GENSCAN.121-1 0.13

314141 AA232134 Hs.190028 ESTs 0.13

300509 AI239845 Hs.128494 ESTs; Weakly similar to EG.-8587.2 [D.melanogaster] 0.13

25 338530 CH22_EM:AC005500.GENSCAN598-11 0.13

335968 CH22_FGENES.65ZJ 0.13 

314121 AI732100 Hs.187619 ESTs 0.13 

337593 CH22_C20H12.GENSCAN.6-8 0.13 

332881 CH22_FGENESJ3_1 0.13 

30 305836 AA858043 EST singleton (not in UniGene) with axon hit, 0.13 

339059 CH22_DA59H18.GENSCAN.30-5 0.13 

305610 AA782319 EST singleton (not in UniGene) with axon hit 0.13 

305852 AA862455 EST singleton (not in UniGene) with exon hit 0.13 

327409 CH.02_hsgi|5867750 0.13

35 312751 AI613089 Hs.164178 ESTs 0.13

308726 AI799268 Hs.209929 EST 0.13

325961 CH.16_hsgi|5887147 0.13

311159 AW025919 Hs.197636 ESTs 0.13

322715 AA057230 Hs.182135 ESTs 0.13

40 336441 CH22_FGENESJ27_7 0.13

336339 CH22_FGENES.814_12 0.13

306911 AI095365 EST singleton (not in UniGene) with exon hit 0.13

333613 CH22_FGENES.217_8 0.13

338489 CH22_EM:AC005500.GENSCAN.384-17 0.131

45 326904 CH.21_hsgi|5867684 0.131

337337 CH22_FGENES.717-1 0.131

326752 CH.20_hsgi|5867615 0.131

303977 AW512978 EST singleton (not in UniGene) with exon hit 0.131

301373 AA595235 EST cluster (not in UniGene) with exon hit 0.131

50 338448 CH22_EM:AC005500.GENSCAN.359-22 0.131

333774 CH22_FGENES.272J 0.131

332986 CH22_FGENES.54_8 0.131

335362 CH22_FGENES.541_12 0.131

335896 CH22_FGENES.635_4 0.131

55 337825 CH22_EM:AC005500.GENSCAN.13-19 0.131

325257 CH.11_hsgi|5866895 0.131

331188 T50240 Hs.167837 ESTs 0.131

330645 Y08302 Hs.144879 dual specificity phosphatase 9 0.131

331760 AA292721 Hs.154434 ESTs; Weakly similar to unknown [H.sapiens] 0.131

60 322995 AA513829 Hs.29797 ribosomal protein L10 0.131

335497 CH22_FGENES.571_5 0.131

334824 CH22_FGENES.437_6 0.131 

319480 R06933 Hs.184221 ESTs 0.131 

334842 CH22_FGENES.439_21 0.131 

65 333335 CH22_FGENES.139_4 0.131 

317252 AA905178 Hs.130124 ESTs 0.131 

329034 CH.X_hsgi|5868561 0.131

305186 AA664230 EST singleton (not in UniGene) with exon hit 0.131

335755 CH22_FGENES.6W_4 0.131 

302143 H1SZ70 Hs.189847 putative neuronal cell adhesion molecule 0.131

334939 CH22_FGENES.465_3 0.131

318994 C15110 Hs.17802 ESTs 0.131

334498 CH22_FGENES.397_14 0.131

5 333413 CH22_FGENES.146J 0.132

329876 CH.14_p2gi|6272128 0.132

327277 CH.01_hsgi|5867473 0.132

305022 AA627416 EST singleton (not in UniGene) with exon hit 0.132

336805 CH22_FGENES.19W 0.132

10 320121 T93857 EST cluster (not in UniGene) 0.132

334761 CH22_FGENES.428_10 0.132

339400 CH22_BA232E17.GENSCAN.7-6 0.132

330301 CH.06_p2gi|2905862 0.132

316822 AA827691 Hs.129967 ESTs; Weakly similar to neuronal thread protein

15 AD7c-NTP [H.sapiens] 0.132

328020 CH.06_hsgi|5902482 0.132

325327 CH.11_hsgi|5866875 0.132

321163 AA209530 EST cluster (not in UniGene) 0.132

336393 CH22_FGENES.823_6 0.132

20 325905 CH.16_hsgi|5867104 0.132

305237 AA676286 Hs.2186 eukaryotic translation elongation factor 1 gamma 0.132

339046 CH22_DA59H18.GENSCAN.28-6 0.132

325375 CH.12_hsgi|5866920 0.132

333961 CH22_FGENES_B4_7 0.132

25 335450 CH22_FGENES.562_8 0.133

302286 R58438 EST cluster (not in UniGene) with exon hit 0.133

335116 CH22_FGENES.498_3 0.133

327333 CH.01_hsgi|5902477 0.133

308070 AI470948 EST singleton (not in UniGene) with exon hit 0.133

30 308311 AI581855 EST singleton (not in UniGene) with exon hit 0.133

320813 AW360847 Hs.208839 ESTs 0.133

323865 AW248307 EST cluster (not In UnlGene) 0.133 

328318 CH.07_hsgi|5868373 0.133

320603 R51419 EST cluster (not in UniGene) 0.133

35 332791 CH22_FGENES.3J 0.133

314976 AA524725 Hs.162108 ESTs 0.133

303309 AL134164 Hs.224868 ESTs 0.133

320581 R39753 Hs.170187 ESTs 0.133

333944 CH22_FGENES_K__2 0.133

40 317992 AI733512 Hs.130901 ESTs 0.133

330935 F02383 Hs.26492 beta-1,3-glucuronyltransferase 3 (glucuronosyltransferase I) 0.133

336659 CH22_FGENES.36-5 0.133

338887 CH22_DJ32I10.GENSCAN.e-10 0.133

305273 AA679979 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 0.133

45 333566 CH22_FGENES.183_2 0.134

316952 AW450033 Hs.163312 ESTs 0.134

333818 CH22_FGENES_S3_1 0.134

328687 CH.07_hsgi|5868262 0.134

302879 H11802 EST cluster (not in UniGene) with exon hit 0.134

50 336557 CH22_FGENES.842_2 0.134

335222 CH22_FGENES.513_5 0.134

338094 CH22_EM:AC005500.GENSCAN.179. 0.134

337384 CH22_FGENES.745-1 0.134

327360 CH.01_hsgi|6552411 0.134

55 328132 CH.06_hsgi|5868038 0.134

323604 AI751438 Hs.182827 ESTs; Weakly similar to !!!! ALU SUBFAMILY SQ

WARNING ENTRY !!!! 0.134

337591 CH22_C20H12.GENSCAN.0O 0.134

307018 AI140639 EST singleton (not in UniGene) with exon hit 0.134

60 326896 CH.21_hsgi|5867680 0.134

333479 CH22_FGENES.163_5 0.134

337915 CH22_EM:AC005500.GENSCAN„1-3 0.134

335110 CH22_FGENES.494_18 0.134

333481 CH22_FGENES.163_9 0.134

65 327512 CH.02_hsgi|6117815 0.134

300096 AW328639 Hs.83575 ESTs; Weakly similar to ZC328.3 [C.elegans] 0.134

330163 Ca02_p2gi|6042042 0.135

335752 CH22_FGENES.604_1 0.135

334857 CH22_FGENES.443_1 0.135 

301872 H84730 EST cluster (not in UniGene) with exon hit 0.135

337529 CH22_FGENES.823-9 0.135

335734 CH22_FGENES.601_4 0.135

337551 CH22_FGENES.847-8 0.135

5 309078 AI92096S Hs.77961 major histocompatibility complex; class I; B 0.135

335513 CH22_FGENES.571_28 0.135

339078 CH22_DA59H18.GENSCAN.37-6 0.135

321907 N56660 Hs.148722 ESTs; Weakly similar to large tumor suppressor 1 [H.sapiens] 0.135

337189 CH22_FGENES.571-32 0.135

10 329635 CH.12_p2gi|5302317 0.135

308601 AI719930 EST singleton (not in UniGene) with exon hit 0.135

305020 AA627248 Hs.2064 vimentin 0.135

333894 CH22_FGENES.283_1 0.135

322465 AA137152 Hs.3784 ESTs; Highly similar to phosphoserine aminotransferase

15 [H.sapiens] 0.135

305601 AA780975 EST singleton (not in UniGene) with exon hit 0.135

332186 H10781 Hs.141051 ESTs; Moderately similar to !!!! ALU SUBFAMILY SB

WARNING ENTRY 0.135

327822 CH.05_hsgi|5867968 0.135

20 310087 AI393914 Hs.160624 ESTs; Weakly similar to similar to CR16; SH3 domain

binding protein 0.135

328752 CH.07_hsgi|5868298 0.135

337611 CH22_C20H12.GENSCAN.19-4 0.135

334470 CH22_FGENES.394_1 0.136 

25 335115 CH22_FQENESj<96_2 0.136 

328730 CR07_hsgi|5868289 0.136 

330350 CH.09_p2gt|3056S22 0.136 

336971 CH22_FGENES.378-6 0.136 

308258 A1565612 EST singleton (not In UniGane) with exon hit 0.136 

30 326745 CR20_hsgi|5867S11 0.136 

335440 CH22_FGENES£60_3 0.136 

320257 AA33074S EST cluster (not in UniGene) 0.136 

328677 CH.07_hsgil5868256 0.136 

329731 CH.14_p2gt|6065783 0.136 

35 315950 AA700553 H&206974 ESTs 0.136 

330049 CH17_p2giI4567182 0.136 

337070 CH22_FGENES.448-3 0.136 

304095 H1 1324 HSJ1059 EST 0.136 

309304 AW005527 Hs.232820 EST 0.136

333458 CH22_FGENES.157_7 0.136

329899 CH.15_p2gi|6563505 0.136

322202 AE75056 HS200133 ESTs 0.136 

333991 CH22_FGENES.310_15 0.136

318617 AW247252 Hs.75514 nucleoside phosphorylase 0.136 

45 310623 A13415B6 Hs.195588 ESTs 0.136 

330489 M23323 Hs.3003 CD3E antigen; epsilon polypeptide (TiT3 complex) 0.136

309646 AW194694 EST singleton (not in UniGene) with exon hit 0.136

331068 R00071 Hs.191199 ESTs 0.136 

334285 CH22_FGENES.369_15 0.136

332178 F13689 Hs.100725 EST 0.136

305724 AA827608 EST singleton (not in UniGene) with exon hit 0.136

303158 AL138110 Hs.8594 Homo sapiens mRNA containing (CAG)4 repeat; clone CZ-CAG-7 0.136

334543 CH22_FGENES.403_8 0.136

335384 CH22_FGENES.543_26 0.136

55 336527 CH22J=GENES.839JB 0.136 

334951 CH22_FGENES.465_20 0.136

325882 CH.16_hsgi|5867087 0.137

305134 AA653159 EST singleton (not in UniGene) with exon hit 0.137

307058 AI148709 EST singleton (not in UniGene) with exon hit 0.137

331943 AA453418 Hs.178272 ESTs 0.137

331116 R44780 Hs.22634 ESTs 0.137

306094 AA908877 EST singleton (not in UniGene) with exon hit 0.137

333561 CH22_FGENES.180_18 0.137 

321439 H61962 EST cluster (not in UniGene) 0.137 

324594 AA497090 EST cluster (not in UniGene) 0.137

337926 CH22_EMAC005500.GENSCAN.77-4 0.137

337353 CH22_FGENES.726-1 0.137

331836 AA412295 Hs.104774 EST 0.137 

308981 AI873242 EST singleton (not in UniGene) with exon hit 0.137 
329424 CH.Y_hsgi|5868879 0.137

325829 CH.15_hsgi|5867052 0.137

331845 AM16883 H&88183 ESTs 0.137 

333SS4 CH22_FGENES290_13 0.137 

306591 AI000248 EST singleton (not in UniGene) with exon hit 0.137

328948 CH.08_hsgl|8456765 0.137 

338935 CH22_DJ32I10.QENSCAH18-12 0.137 

325960 CH.16Jsg!j5B87147 0.137 

328377 CH.07Jisgtj5B68390 0.138 

308851 AI829820 EST singleton (not in UniGene) with exon hit 0.138

314820 AA4243S2 H&210586 ESTs 0.138 

337592 CH22_C20H12.GENSCAN.6-7 0.138 

338684 CH22_BAAC00550aGENSCAN472-3 0.138 

331800 AA400438 Hs.97543 ESTs 0.138

304587 AA505535 EST singleton (not in UniGene) with exon hit 0.138

333981 CH22_FGENES.310_4 0.138

332452 AA040369 Hs.11170 SYT interacting protein 0.138

305752 AA835278 EST singleton (not in UniGene) with exon hit 0.138

311947 T65554 HSJ251591 EST 0.138 

333783 CH22_FGENES.273_6 0.138

337408 CH22_FGENES.754-14 0.138

327976 CH.06_hsgi|5868212 0.138

325593 CH.13_hsgi)586S992 0.138 ~ 

339425 CH22_DJ579N16.GENSCAN.144 0.138 

304475 AA428879 EST singleton (not in UniGene) with exon hit 0.138

309488 AW131104 EST singleton (not in UniGene) with exon hit 0.138

337532 CH22_FGENES.827-6 0.138 

317234 AA904448 Hs.126368 ESTs 0.138 

312261 AA854425 Hs.144455 ESTs 0.138 

328927 CH.08_hsgi|5868500 0.138

336424 CH22_FGENES.824_9 0.138 

326667 CHi0_hsg!I6552455 0.138 

325988 CH.16_hsgi|5867084 0.138

318446 AW300287 EST cluster (not in UniGene) 0.139 

35 335511 CH22_FGENES.834_6 0.139 

335204 CH22_FGENES.508_13 0.139

303244 M147472 EST cluster (not in UniGene) with exon hit 0.139 

330870 AA115804 Hs.187593 ESTs 0.139

329376 CH.X_hsgi|5868859 0.139

304703 AA563898 EST singleton (not in UniGene) with exon hit 0.139

333653 CH22_FGENES.239J2 0.139 

306799 AI051S96 EST singleton (not in UniGene) with exon hit 0.139 

304872 AA595289 EST singleton (not in UniGene) with exon hit 0.139 

330812 AA013001 Hs.60563 ESTs 0.139 

45 329568 CH.10_p2gP62490 0.139 

319210 AA253074 Hs.146261 ESTs 0.139 

334320 CH22_FGENES.374_5 0.139 

300880 AI916949 Hs.149748 ESTs; Weakly similar to weak similarity to collagens [C.elegans] 0.139

305866 AA864533 EST singleton (not in UniGene) with exon hit 0.139

312943 AA984364 Hs.119064 ESTs 0.139

330523 M99439 Hs.83958 transducin-like enhancer of split 4; homolog of Drosophila E(sp1) 0.139

312708 AI076204 Hs.135440 ESTs 0.139

309366 AW072970 EST singleton (not in UniGene) with exon hit 0.139

303273 AA316069 EST cluster (not in UniGene) with exon hit 0.139

317484 AW274696 Hs.143921 ESTs 0.139

333239 CH22_FGENES.111_1 0.139

307126 AI184951 EST singleton (not in UniGene) with exon hit 0.139

316813 AA826505 Hs.124517 ESTs 0.139

331746 AA281365 Hs.121640 ESTs; Weakly similar to KIAA0336 [H.sapiens] 0.139

308558 AI700145 Hs.172182 poly(A)-binding protein; cytoplasmic 1 0.139

310784 AW086142 Hs.159017 ESTs 0.139 

323831 AA335715 Hs.200299 ESTs 0.139

307692 AI318342 EST singleton (not in UniGene) with exon hit 0.139

310570 AI318327 EST cluster (not in UniGene) 0.139

327934 CH.06_hsgi|5868184 0.139

305232 AA670052 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 0.139

334756 CH22_FGENES.428_5 0.139 

331938 AA451867 Hs.99255 ESTs 0.139 

301393 AI474722 Hs.150898 ESTs; Weakly similar to KIAA0644 protein [H.sapiens] 0.139

312005 T78450 Hs.13941 ESTs 0.139

338431 CH22_EMAC005500.GENSCAN.351-4 0.14

331214 T90496 Hs.16757 ESTs 0.14 

333601 CH22_FGENES.213_4 0.14 

323481 AA278449 Hs.137429 ESTs 0.14

338911 CH22_FGENES.344-4 0.14 

338157 CH22_EM'j\C005500.GENSCANm5 0.14 

327845 CH.05_hsgi|5531862 0.14

319109 Z45662 Hs.50797 Homo sapiens clone 23620 mRNA sequence 0.14

334763 CH22_FGENES.428_12 0.14

329384 CH.X_hsgi|5868869 0.14

302996 AF054663 EST cluster (not in UniGene) with exon hit 0.14

323751 AW452656 Hs.209824 ESTs 0.14

329916 CH.16_p2gi|6223624 0.14

301993 N49828 Hs.18602 ESTs 0.14

338129 CH22_EMAC005500.GENSCAN.197-2 0.14

325704 CH.14_hsgi|5867028 0.14

335656 CH22_FGENES.590_7 0.14

331673 W72366 Hs.40033 ESTs 0.14

316807 AI018331 Hs.172444 ESTs; Highly similar to transcription regulator [M.musculus] 0.14

310743 AW449754 Hs.158665 ESTs 0.14

326941 CH.21_hsgi|6004446 0.14

328809 CH.07_hsgi|5868327 0.14

323855 AI653164 Hs.128665 ESTs 0.14

304705 AA564064 EST singleton (not in UniGene) with exon hit 0.14

32S666 CH14_hsgi|6469822 0.14 

333747 CH22_FGENES.265_6 0.14

318287 AW015616 Hs.143321 ESTs 0.141 

332972 CH22_FGENES.51_5 0.141 

305704 AA825266 EST singleton (not in UniGene) with exon hit 0.141

315699 AW182805 Hs.189183 ESTs; Weakly similar to Nod1 [H.sapiens] 0.141

327298 CH.01_hsgi|5867492 0.141

336400 CH22_FGENES.823_15 0.141

321033 H26214 Hs.20733 ESTs; Weakly similar to !!!! ALU SUBFAMILY SX WARNING ENTRY 0.141

316522 AI475995 Hs.122910 ESTs 0.141 

335715 CH22_FGENES.599J5 0.141 

335959 CH22_FGENES£50_2 0.141 

333259 CH22_FGENES.118_7 0.141 

337382 CH22_FGENES.744-8 0.141

322346 AA227618 Hs.10882 HMG-box containing protein 1 0.141

325378 CH.12_hsgi|5866920 0.141

338500 CH22_EMAC005500.GENSCAN.390-1 0.141

338460 CH22_EMAC005500.GENSCAN.362-5 0.141

315279 AW511138 Hs.256581 ESTs 0.141

314439 AI53S443 H&137447 ESTs 0.141 

333624 CH22_FGENES.222_3 0.141 

329237 CHXhsgi|58S8729 0.141 

330117 CH.19_p2gi|6015201 0.141

338017 CH22_EMAC005500.GENSCAN.134-1 0.141

337854 CH22_EMAC005500.GENSCANJ8-12 0.142 

329984 CH.16_p2gi|4646193 0.142

305004 AA622328 Hs.162762 EST 0.142

302815 N40373 EST cluster (not in UniGene) with exon hit 0.142

327823 CH.05_hsgi|5867968 0.142

326753 CH.20_hsgi|5867616 0.142

301201 AA904482 Hs.197775 ESTs 0.142 

334303 CH22_FGENES.373_6 0.142

326453 CH.19_hsgi|5867399 0.142

311050 AI884581 Hs.215477 ESTs 0.142

308740 AI802711 Hs.210337 EST; Weakly similar to aldolase A [H.sapiens] 0.142

331003 H63959 Hs.142722 ESTs 0.142 

338010 CH22_EMAC005500.GENSCAN.128-8 0.142 

336326 CH22_FGENES.812_4 0.142 

65 318100 R4430B Hs*42302 ESTs 0.142 

320641 R55421 EST cluster (not In UniGene) 0.142 

325855 CH.16_hsgi|5867067 0.142

330425 HG1728-HT1734 Non-Specific Cross Reacting Antigen (Gb:D90277), Alt. Splice Form 2 0.142
324583 AM25411 Hs22581 ESTs 0.142

328288 CH.17_hsgt|5867267 0.142 

331390 AA460341 Hs.45008 ESTs 0.142 

338904 CH22_PJ32I10.GENSCAN.10-16 0.143 

333098 CH22_FGENES.79_1 0.143

331919 AA44S869 Hs.119316 ESTs 0.143 

312214 AE48004 Hs.125187 ESTs 0.143 

323198 AW179174 Hs.7984 ESTs 0.143 

316107 AI204001 Hs.184014 ribosomal protein L31 0.143 

301335 AA885317 Hs.190511 ESTs 0.143

337392 CH22_FGENES.747-3 0.143 

325543 Cai2_hsfli|66824S2 0.143 

305903 AA873085 EST singleton (not in UniGene) with exon hit 0.143

332707 L35594 Hs.174185 phosphodiesterase I/nucleotide pyrophosphatase 2 (autotaxin) 0.143

15 337913 CH22_EfctAC005500.GENSCAN£9-10 0.143 

301436 AA961061 Hs.131696 ESTs 0.143 

335078 CH22_FGENES/S88_5 0.143 

338451 CH22_EM^CO05500.GENSCAN559^9 0.143 

302777 AJ230640 EST cluster (not in UniGene) with exon hit 0.143

330464 J03068 Hs.78223 N-acylaminoacyl-peptide hydrolase 0.143

330988 H41411 Hs.33855 ESTs 0.143

328939 CH.08_hsgi|6004481 0.143

308015 AI440174 Hs.228907 EST; Weakly similar to GUANINE NUCLEOTIDE-BINDING PROTEIN BETA SUBUNIT-LIKE PROTEIN 12.3 [H.sapiens] 0.143

328504 CH.07_hsgi|5868471 0.143

332599 AA402891 Hs.32951 solute carrier family 29 (nucleoside transporters); member 2 0.143

335744 CH22_FGENES.601_15 0.143

322394 AF077208 EST cluster (not in UniGene) 0.143

30 323892 AUM2661 EST cluster (not in UniGene) 0.143 

318443 AI939323 Hs.157714 ESTs; Weakly similar to NEURONAL ACETYLCHOLINE RECEPTOR PROTEIN; ALPHA-5 CHAIN PRECURSOR [H.sapiens] 0.143

336568 CH22_FGENES.843_7 0.143 

330958 H08815 Hs.159824 EST 0.143

327672 CH.04_hsgi|5867843 0.143

335900 CH22_FGENES.635_8 0.144 

336044 CH22_FGENES.679_6 0.144

318845 AI815951 Hs.33183 ESTs; Weakly similar to estrogen-responsive finger protein; efp [H.sapiens] 0.144

333483 CH22_FGENES.165_2 0.144

333337 CH22_FGENES.139_6 0.144

305993 AA889197 EST singleton (not in UniGene) with exon hit 0.144 

335719 CH22_FGENES.599_22 0.144

325682 CH.14_hsgi|6138923 0.144

327350 CH.01_hsgi|6249563 0.144

339291 CH22_BA354I12.GENSCAN.18-1 0.144

326358 CH.18_hsgi|5867293 0.144

330316 CH.08_p2gi|6007576 0.144

50 308150 AJ499346 Hs.174131 ribosomal protein 16 0.144 

338065 CH22_EMAC005500.GENSCAN.164-1 0.144 

339009 CH22_PA59H18.GENSCAN.18-7 0.144 

327776 CH.05_hsgi|5867964 0.145

336664 CH22_FGENES.41-8 0.145

321921 AF070619 EST cluster (not in UniGene) 0.145

319346 T70147 Hs.12024 ESTs 0.145 

304265 AA062892 EST singleton (not in UniGene) with exon hit 0.145 

303818 Z45986 Hs.250178 copinell 0.145 

327498 CH.02_hsgi|6017023 0.145 

335227 CH22_FGENES.513_13 0.145

339022 CH22_DA59H18.GENSCAN.22-1 0.145 

302597 H55681 Hs.33026 ESTs; Weakly similar to similar to Enterococcus faecalis TRAB [C.elegans] 0.145

308550 AI697008 Hs.201811 EST 0.145

302175 AA262760 Hs.156015 Homo sapiens chromosome 19; cosmid R29381 0.145

303252 AA156760 EST cluster (not in UniGene) with exon hit 0.145

337414 CH22_FGENES.757-2 0.145

310362 AI734009 EST cluster (not in UniGene) 0.145

329333 CHJ_hsgq5868806 0.145 
325785 CH.14_hsgi|6331957 0.148

333166 CH22_FGENES.91_8 0.148 

33S548 CH22_FGENES.841_5 0.148 

337552 CH22_C4Q1.GBJSCAN.1-4 W48 

331775 AA382742 Hs.97151 EST 0.148

338936 CH22_DJ32I10.GENSCAN.19-6 0.148

331669 AA428554 Hs.104894 ESTs; Weakly similar to fibronectin precursor [H.sapiens] 0.148

332865 CH22_FGENES.28_5 0.148 

32S663 Ca07_hsgt|6004473 0-148 

10 323436 CR07_hsg!|5868417 0.148 

311158 AI634864 Hs.250789 ESTs; Highly similar to similar to NEDD-4 [H.sapiens] 0.148

335942 CH22_FGENES.354-2 0.148 

302262 R53169 Hs.246091 ESTs 0.149

333296 CH22_FGENES.132_3 0.149

333365 CH22_FGENES.142_2 0.149

311706 AW452392 Hs.252854 ESTs 0.149

337109 CH22_FGENES.489-2 0.149

315062 AW173300 Hs.190201 ESTs 0.149 

333454 CH22_FGENES.157_3 0.149 

334784 CH22_FGENES.432_9 0.149

333255 CH22_FGENES.118_3 0.149

337518 CH22_FGENES.814-7 0.149

320651 AA489268 EST cluster (not in UniGene) 0.149

323437 AA287567 EST cluster (not in UniGene) 0.149 

328761 CH.07_hsgi|5868302 0.149

328787 CHJJ7Jisgi|58S8309 0.149 

335261 CH22_FGENES.520_2 0.149 

300827 R16689 Hs.106004 ESTs 0.149 

339263 CH22_BA354I12.GENSCAN.10-1 0.149 

337412 CH22_FGENES.756-6 0.149

334414 CH22_FGENES.384_1 0.149

332931 CH22_FGENES.38_5 0.149

310801 AW270980 Hs.106346 novel centrosomal protein RanBPM 0.149

305216 AA669056 EST singleton (not in UniGene) with exon hit 0.149

35 314779 AA470122 Hs.190261 ESTs 0.149 

338414 CH22_EM^aXK500.GENSCAN341-27 0.149 

303342 AW247361 EST cluster (not in UniGene) with exon hit 0.149 

337509 CH22_FGBIESJ064 0.149 

306631 AI001149 EST singleton (not in UniGene) with exon hit 0.149

302533 L36149 Hs.248116 chemokine (C motif) XC receptor 1 0.149

336536 CH22_FGENES.839_18 0.149 

324666 T32458 Hs.14285 ESTs 0.149 

310173 AI767433 Hs.170013 ESTs 0.149 

333595 CH22_FGENES.211_2 0.149 

335975 CH22_FGENES.652_9 0.15

306654 AI003654 EST singleton (not in UniGene) with exon hit 0.15

335025 CH22_FGENES.475_3 0.15 

328711 CHX)7_hsgil5668271 0.15 

328274 Ca07Jisgi]5868219 0.15 

325505 CH.12_hsgi|6682451 0.15

329641 CH.14_p2gi|6468233 0.15

304955 AA613504 EST singleton (not in UniGene) with exon hit 0.15

339103 CH22_DA59H18.GENSCAN.44-10 0.15

329636 CH.12_p2gi|5302817 0.15

310118 AI203293 Hs.157489 ESTs 0.15

326056 CH.17_hsgi|5867184 0.15

303773 AA769074 EST cluster (not in UniGene) with exon hit 0.15

303153 U09759 Hs.8325 mitogen-activated protein kinase 9 0.15
TABLE 13A shows the accession numbers for those primekeys lacking UniGene IDs for
Table 13. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers



Pkey CAT number Accession






322533 38937J
321921 34680J 
321927 21620J 






322050 24275 1 AL137589 AA423949 BE222949 BE222694 AI19961S AW8731 16 AI277950 AWO44290 AW630096 
BE619S17 BE388097 BE26402S BE618945 BE614758 BE312249 BE294359 BE531121 BE622300 BE81S109 BE544354
BE61499B BE393239 BE297520 BE393221 BE278818 BE279309 BE26S476 BE618772 BE6151B5 BE265144 BE249837 
BE312230 BE407843 BE253884 BE407845 BE615804 BE619058 BES59512 BE383249 BE613497 BE294351 BE295062 
BG622385 BE390654 BE535438 BE5S3188 BE396374 BE270842 BE386110 BE260388 BE250188 BE26S87S BE537229 
BE2S3369 BE2S6997 BE269482 BE2S4959 BE279072 AA662160 BE28C733 AA8S8428 BE561308 BE267285 BE561422 
BE563181 BE304614 BE295437 BE619424 BE275863 BE394315 BE408109 BES41866 BE253772 BE61B238 BE535261 
BE296490 BE278212 BE563154 BE25724S BE262274 BE513032 BE378567 BE3941S2 BE618947 BE269302 BES46516 
BE536792 BE615187 BE261186 BE615367 BE619289 BE261184T49376 AL031671 BE273400 BE563457BE545597 
BE615169 AA150323 M158723 AA078033 BE313333 AA160100 BE27111S BE294302 BE273051 BE273048 BE622390 
AA837947 BE387721 AW973277 AA808731 BE280792 AA1 60444 BE255723 AI745B0 AA64301 7 BES49441 BE293858 
AW975249 A1620319 AW089494 AI434549 BE305231 AA081262 BE280101 AA522507 AB50880 AA187460 BE386860 
AW359229 BE170489 BE620149 BE548218 AA31 6696 AA484426 A1S67740 AA1 60605 AW93980S AA089573 BE300194 
BE391331 AW975419 H26808 BES45544 BES15974 AW800241 BE616222 W17343 BE387885 T53697 C03943 BE817637 
BE315130 T52942 T50588 N74693 AA187107T59919 AW7B7397 AA206447 AA854619 T57175 AI570296 AW517964 
AA158269 AI282220 W2S297 AIS80710 BE262453 AI185868 AA52648S AI288051 A1582513 AA100675 AW615567 
BE395354 AI472725 BE314881 BE621281 N99921 AI282689 AI43272S AW73201 1 AA8722S4 BE205807 T59435 A1282712 
AA650505 AI004374 AA725260 BE313161 T60173 AI371280 BE385641 AW751812 AA078827 A1491858 A1433S22 
AA219118AI002092AA996003AA064604A1250287A1304397AI453213AAB53630A^ 
AA715629 AW973783 AA932493 A1347563 M181309 T67880 AA643033 AW467498 AA11S904 AA935410 AA483032 
AA084568 W25246 AB67588 AA1SS732 AA158614 AA888319 AA1S8568 AA188422 AB09183 AA084817 M157995 
AI859659 AA188008 AE87379 AIS4067S M08S212 AW028391 AA173297 BE256792 AA182854 BE378771 BES38571 
AA079037BE281597AAS43926W81011 AA159344AA320691 AA877597 T57107 AW263819 A1690413 A1619505 A1687579 
AA970S60 AB68942 AI927104 AW419220 AI620051 AA128490 AA120825 AA07952O AA199648 AW188403 BE045224 
AW265533 AA074338 AA102685 AW779399 AA192451 AA182771 AW366812 BE281418 AA21 1094 AA131073 AA487924 
AW674848 AI568103 AA171934 F30349 AW088785 AA581370 AA205482 AW352298 AW517565 AB76249 M158884 
AI340509 T53965 AA085193 AA071S70 A1874045 AA8S2755 BE045217 AW189428 AA21 1141 AAS52134 AI497729 
AA994817 AI811459 BE53S857 AW769897 AW167892 AW149305 AI864981 AW272126 AW023245 A1439268 AI953198 
AA160912 AI718580 BES37S47 AA501448 AA069308 L07393 AA353007 AA079235 AI539140 AA740154 W58341 AA888403 
BE299000 AA196413 BE613327 BE281523 AA866599 AW844713 AI691 159 A1079976 AW327479 BE180731 AA984805 
AW500732AW504061 
AA774672AWS04164 
AA769074 AA570769 AA808585 AA808882 
AW505368 AA218610 F11852T65345 AA397808 
BE29771 1 AWS0S574 AA704983 
F07942T08033 

BE386266 BE148823 T2321S AI90S290 AA299906 BE2071 97 AW0741 1 4 AI760368 A1005358 AW662201 M188988 
AI690711 AA775103AW072931 AI584269 AW129364 AW615834 A1049941 AW874040 A1352633 AA188989 AI28777S 
AA868774AAS99660 
AA780365 AA909233 AI275542 
AA210878AA215684R11101 

M13560 AA336951 AA161015 R72814 T69687 R75705 T61319 AA158454 R50579 T56649 AI214156 T7037S R31655 
H64997 AW800487 H491 10 AA634206 H42384 H21783 AI560152 AA664230 H42302 R48708 AA013277 T61901 T92417 
AA87598S T61962 T63055 AA430725 AA458964 AAS78746 AI582385 T63000 AI49987S H64998 AA022538 AI364804 
AI86S21 1 AM39714 AB24059 A1249917 T592S8 AA47780S AA715834 AA916120 R38304 R35899 RB2985 H25524 H82984 
AW516728 T54642 AA079868 H27555 AA455820 T63919 R79450 AI431241 AA937349 AA127213 AA421729 H611S6 
T63894 AA013050 AA079133 W9S364 AA487926 AI76278S H26377 A1433388 AI8SS423 AW371475 R98189 AA643978 
AI718204 AW381954 AI862735 
319638 226485J AA3237S8 R12731 R14082 

320257 183534J R17531 AW960899 AA338368 AW673294 BE047729 BE047722 AA330746 AW841797 H05C30 All 42105 R12654 
320289 115941 1 H07989 AJ239462 H24544 AA078369 R74153 

304703 33971 42 BE512926BE304794M129140AA052922AA092258BE378058BE615391BE615218BE816188^^ 

W56857 AI028525 BE617241 BE531271 AW856227 T56489 AA322005 AW794148 AF170577 BE615738 AAD05138 L76930 
L76932 176933 X95410 AW389462 BE563032 AW997937 AA263158 AI520992 AW947350 AA522535 AW945921 AV653778 
AW884835 AW947338 AI687178 AW9457B9 A190S627 AW948449 AV653751 AW945924 AA563898 AW945810 AW945832 
AW371449 AW945864 AW948447 AW945910 AA643002 AAS22680 AA522715 AA578840 AA52327S AAB26150 AW945809 
AW405998 AA551909 R23173 AA595545 AW389497 AI933770 AI125053 AI471 803 AW795856 AW796937 W30675 H7031 7 



303701 1155179J 
303759 447287J 
303773 356632J 
303778 174437.1 
303784 414659J 
303845 50211.2 
303898 162688J3 



20121 452027J 
319590 171338J 
305186 17456.1 
321039 25338 J






306051 1S085J3 



321163 171122.1 
321235 1102181.1 
320603 4297.1 



320641 185591J 
320651 58648.1 
321325 28266.1 

305704 464759.-1 

322011 23158.1 

306407 

306454 

306516 

306518 

306526 

306534 

306590 

306591 

306631 

306654 

306786 

306799 

308023 

308070 

308099 

306805 

306814 

306873 

306911 

306982 

308238 

308258 

308289 

308311 

308332 

308511 

308601 

308612 

308636 

308814 

308851 

308981 

310570 1071946.1 
305022 
305060 
305070 



H68296T59240 AA397650 H58852 AA938072 AA978010 R35643 T89735 AW361585 AW196153 AI538069 AA604540 
AI434259 R49181 T58717 AW062486 AW796966 AI6483B4 R77733 A1623502 BE171342 BE171303 R35656 AW974883 
AW149898 AI500045 AI540710 AI540392 AW009172 AW277199 AB71312 AB00098 AI470297 AW372940 AWB44562 
AW844560 AW797965 AI691146 X07062 AW799199 K60688 AA837684 AF130734 T25952 AI933771 AI914860 AW391825 
AW783843 AW795012 AW366709 AW750987 AW750985 R35765 AW844942 AW750986 H64920 R34651 X86703 
BE018103 BE018083 BE293253 AW247083 BE207643 BE514793 BE183238 AA376427 AW2738S0 AW043788 BE439973 
AL045428 AI889050 AAD26498 AI422924 A1884485 W96068 AAD20872 F371 19 AA714378 AA021107 AA011141 AI554001 
A1375841 AI469097 AA335219 AW967315 AI692177 AA410448 AI568858 AA582647 AA026419 AA281639 AW515248 
AW007777 AA010840 AW188439 AI805423 AH4B210 BE301590 AA744414 AA745392 AW167423 AA622659 AW0OO878 
AM32387 AA760930 BE047189 AA021605 AV65B045 AI093347 AA588594 HS3143 AAS39556 A1308978 AA379270 
AA633407 AI874329 AI206484 AM93895 AK94103 AI249682 AA973765 AA872445 AI125446 AA287272 AW069761 
AA682569 AW009712 BE542774 R50167 BE301574 AA991202 AA502006 AE19819 AW074373 AA617996 AI521242 F2S241 
AW615812 R16774 AA335218 AW673800 H26778 AM68557 AI886986 AI560759 AI460075 AA502968 AA503273 AA610680 
AA287274 AA554020 AA284889 AA916636 AW469457 AW273250 AW673708 AW512948AL041071 AI446042AA903535 
BE172441 AI282411 AW265021 AA810799 AB53865 AA729332 AW00461 1 AW129451 AA659019 BE206239 AA610825 
H03511 BE383995 R16474 AA281701 AW009244 AA287424 AA558139 AW364081 

F08147 AW4083S9 AW949429 R23785 AW247442 AA305512 T29095 AA905130 BE246361 BE244981 AA220199BE504058 
X80B78 AA533727 AA608601 AW005964 AI81 1827 AQ67037 AE77985 AI493719 AK77848 AA854982 AW247298 AI216345 
AI041285 AB87378 AA781241 AB74270 AW628959 AB83083 BE504391 AA729421 AA552188 AA373387 AW880360 
AW875262 AW875369 AW581540 AW875358 AW581568 R23735 AW134768 
W03912 AW971410 AA506385 AA209530 H73495 H48629 W56149 
HS6752AYV340384N48521 

AA853680 AK001668 BE386425 BE563549 BE296124 BE298950 R51419 U46295 BE147292 AA360056 R48018 AW845348 
N47383 AI8172B0 AI671902 AA988104 AA479484 N56996 AI192374 AI927558 AA6S9888 AI789903 AA548397 AI161167 
A1656333 AI418829 AW592671 BE327906 AW513346 AI3BS579 AW469410 AW512809 D25682 AA576079 AA479354 
T30342 R51307 T16044 H290S3 AW079357 AB39477 R47914 AB86068 AI870065 AI868489 AI521099 AI582732 AAS95540 
AW957299 AA352608 AA676752 AA410510 AA358874 AI865724 AA853679 AI699265 AW188789 N47380 AA233715 
BE258194 R55421 R55643 H42362 AA243884 

AW886407 AA489268 R57015 R58094 BE077459 BE077423 BE546995 AW849216 T69383 AW9381 1 1 H60337 BE221073 
AB033100 AA347036 BE260325 AW961 669 AL047207 AA347037 AI766894 AA601045 AI559897 AW139033 AW274622 
AW172884 AW089070 AA804340 AW798925 
AA825266 

AL137354AL043375 
AA971985 
AA977992 
AA989542 



AA989713 
AA991487 
AI000246 
AI000248 
AI001149 
AI003654 
AKJ41589 
AI051698 
AI452732 
AI470948 
AI475914 
AI055966 
AI066577 



AI095365 
AI127883 
AI559492 
AI565612 
AI571211 
AI581855 
AI591235 
AI6B7580 
AI719930 
AI735634 
AI744063 
AI819263 
AI829820 
AI873242 

AI318327 AB18328 AI318495 

AA627416 

AA635771 

AA639783 
305079
305134 
303977 
305216 
305263 
305266 
305396 
305403 
305488 
305549 
305601 
305610 
305621 
305710 
305724 
305744 
305752 
307018 
307055 
307058 
305801 
305830 
305836 
305852 
305858 
305866 
305867 
307128 
305903 



AA641329 

AA653159 

AW512978 

AA669056 

AA679467

AA679772 

AA721052 

AA723748 

AA749000 

AA773530 

AA780975 

AA782319 

AA789035 

AA826544 

AA827608 

AA831819 

AA835278 

AI140639

AI148477 

AI148709 

AA845997 

AA857665 

AA858043 

AA862455 

AA863103 

AA864533 

AA864572 

AI184951 

AA873085 



323803 Q_7Jis 
328809 C_7Jis 
305949 AA884409 
328829 c_7Jis 
330021 Cl6_p2 
330024 c16_p2 
330028 Cl6_p2 
330049 c17ji2 
305993 AA889197 

330095 Cl9_p2 

330096 c19_p2 

307205 AI192479 
307427 AI243437
307491 AI268539
307581 AI284415
307588 AI285535
337672 CH22_60O2FQ_UNieBtAC0O 
337693 CH22_6030FG_UNK_aiAC00 
337738 CH22_6083FG_JJNK_EMAC00 
307692 AI318342
307806 AI351739
309107 AI925823 
309230 AI970747 
339338 CH22_8300FG_UNK_BA354I1 
309257 AI984183
309366 AW072970 
309422 AW087175 
325207 clOJis 
325257 clljis 

309646 AW194694 
309651 AW195850 
325313 clljis 

309924 AW340812 
334030 CH22_1308FQ.S20J_UNK_ai 
334040 CH22_1318FCL322_8JJNK_EM 
334083 CH22_1361FG_327_38_UNK_E 
332810 CH22 _26TO_7_12_UNK_C65E1 
302747 32813J AF062275 LO3830 
302753 33029 1 M74299M74302M74303 
302777 33B03J AJ230640AJ230648 
304094
302824 



325870 
304240 
304410 
304443 
304475 
304522 
304678 
304705 
306004 
306008 
306013 
306082 
336174 
306034 
304823 
304872 
304918 
304955 
306249 



35372J 
41188J 
c16_hs 



H11295 

U21260U21258 
AF054663 AF124197 R70292 



AA009802 
AA284508 
AA399444 
AA428879 
AA465405 
AA548556 
AA564064 
AA889992 
AA894390 
AAB96990 
AAS08508 
CH22_3567FGL710_1JJNKJJA 
AA908377 



306295 
306317 
306347 
306365 



330401 
330463 



330535 
332634 



entre?JD28383 
AWJ. 



1374_-8 
10404J 



AA584837 
AA595289 
AA602697 
AA613504 
AA933340 
AA936892 
AA937331 
AA947909 
AA981144 
AA962086 
AA970548 
D28383 

NMJ0O1O55 AA332948 U2S309 U09031 L19955 L10819 AB66043 X84654 U71086 AV654451 AJ007418 AA053625 
BE168856 AA376730 H12694 AA810348 AA621972 AI818950 AV645367 A1819966 AA910602 AW512449 H67893 AI310497 
AI304330 AI339217 AW193588 AW438688 AI818970 AW316799 AA906527 AA777570 N47673 AI336428 AW945133 
AI038606 R29692 AW194197 AI304748H12639 AA053178 AA493213 AA676958 AA1 13154 AI313469 AB68239 R93183 
W24532 U52852 U54701 AL046864 AA365795 
U11872 

U24488NM.007116 
TABLE 13B shows the genomic positioning for those primekeys lacking UniGene IDs and
accession numbers in Table 13. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also
listed.



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.
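
For readers who wish to process the rows of Table 13B programmatically, the column layout described above can be sketched as follows. This is only a minimal illustration in Python, assuming each row has already been split into its four fields. The names ExonRow, parse_exon_row and exon_length are hypothetical, the "Minus" strand label is an assumption (only "Plus" appears in the rows listed here), and the nucleotide positions are assumed to be inclusive start-end pairs.

```python
from typing import NamedTuple


class ExonRow(NamedTuple):
    """One row of Table 13B (hypothetical container, for illustration only)."""
    pkey: int    # Eos probeset number
    ref: str     # sequence source, e.g. a GI number or a literature reference
    strand: str  # DNA strand from which the exon was predicted
    start: int   # first nucleotide position of the predicted exon
    end: int     # last nucleotide position of the predicted exon


def parse_exon_row(pkey: str, ref: str, strand: str, nt_position: str) -> ExonRow:
    """Build an ExonRow from the four text fields of a table row."""
    # "Minus" is an assumed label; the listed rows only use "Plus".
    if strand not in ("Plus", "Minus"):
        raise ValueError(f"unexpected strand label: {strand!r}")
    start_text, end_text = nt_position.split("-")
    start, end = int(start_text), int(end_text)
    if end < start:
        raise ValueError(f"end precedes start in {nt_position!r}")
    return ExonRow(int(pkey), ref, strand, start, end)


def exon_length(row: ExonRow) -> int:
    """Length of the predicted exon, assuming inclusive coordinates."""
    return row.end - row.start + 1


row = parse_exon_row("332791", "Dunham, I. et al.", "Plus", "72720-73315")
print(row.pkey, row.strand, exon_length(row))
```

Rejecting rows whose end position precedes the start position is a deliberate choice in this sketch, since such rows most likely indicate a transcription error rather than a valid exon.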



Pkey Ref Strand Nt_position

332791 Dunham, I. et al. Plus 72720-73315
332792 Dunham, I. et al. Plus 73381-73768
332810 Dunham, I. et al. Plus 304295-304384
332944 Dunham, I. et al. Plus 2414325-2414932
332972 Dunham, I. et al. Plus 2572152-2572238
333133 Dunham, I. et al. Plus 3360058-3360195
333154 Dunham, I. et al. Plus 3615887-3616019
333155 Dunham, I. et al. Plus 3616832-3617003
333227 Dunham, I. et al. Plus 3992866-3992968
333230 Dunham, I. et al. Plus 3995507-3996507
333298 Dunham, I. et al. Plus 4581537-4581947
333304 Dunham, I. et al. Plus 4629943-4630242
333305 Dunham, I. et al. Plus 4630388-4630645
333365 Dunham, I. et al. Plus 4786883-4787283
333383 Dunham, I. et al. Plus 4907179-4907277
333391 Dunham, I. et al. Plus 4916697-4916780
333392 Dunham, I. et al. Plus 4918294-4918433
333397 Dunham, I. et al. Plus 4922466-4922635
333403 Dunham, I. et al. Plus 4925140-4925256
333413 Dunham, I. et al. Plus 4943824-4943974
333445 Dunham, I. et al. Plus 5097827-5097885
333479 Dunham, I. et al. Plus 5272855-5272939
333481 Dunham, I. et al. Plus 5286358-5286505
333483 Dunham, I. et al. Plus 5297945-5298105
333516 Dunham, I. et al. Plus 5570204-5570390
333517 Dunham, I. et al. Plus 5570729-5570925
333518 Dunham, I. et al. Plus 5571761-5572025
333531 Dunham, I. et al. Plus 5622622-5622684
333566 Dunham, I. et al. Plus 5954226-5954473
333572 Dunham, I. et al. Plus 6026898-6027189
333588 Dunham, I. et al. Plus 6246834-6247314
333588 Dunham, I. et al. Plus 6255445-6255779
333594 Dunham, I. et al. Plus 6308990-6309450
333595 Dunham, I. et al. Plus 6323103-6323348
333600 Dunham, I. et al. Plus 6355629-6355925
333601 Dunham, I. et al. Plus 6360075-6360442
333607 Dunham, I. et al. Plus 6504431-6504690
333612 Dunham, I. et al. Plus 6549583-6549697
333613 Dunham, I. et al. Plus 6550643-6550748
333614 Dunham, I. et al. Plus 6551227-6551389
333624 Dunham, I. et al. Plus 6595146-6595244
333626 Dunham, I. et al. Plus 6614174-6614467
333635 Dunham, I. et al. Plus 6663683-6663973
333637 Dunham, I. et al. Plus 6674958-6675134
333642 Dunham, I. et al. Plus 6708760-6709139
333647 Dunham, I. et al. Plus 6772502-6772779
333653 Dunham, I. et al. Plus 6811130-6811392
333654 Dunham, I. et al. Plus 6816731-6816993
333656 Dunham, I. et al. Plus 6822087-6822406
333657 Dunham, I. et al. Plus 6831369-6831445
333658 Dunham, I. et al. Plus 6835282-6835474
333659 Dunham, I. et al. Plus 683617*6836248
333684 Dunham, I. et al. Plus 7169581-7169742
333686 Dunham, I. et al. Plus 7177117-7177302
333697 Dunham, I. et al. Plus 7203859-7203934
333698 Dunham, I. et al. Plus 7205279-7205383
333699 Dunham, I. et al. Plus 7206101-7206175
333703 Dunham, I. et al. Plus 7215559-7215663
333709 Dunham, I. et al. Plus 7229730-7229835
333747 Dunham, I. et al. Plus 7605884-7606206
333774 Dunham, I. et al. Plus 7716509-7716636
333775 Dunham, I. et al. Plus 7729983-7730149
333806 Dunham, I. et al. Plus 7877475-7877668
333843 Dunham, I. et al. Plus 7978762-7978887
333854 Dunham, I. et al. Plus 8029448-8029524
333873 Dunham, I. et al. Plus 8133266-8133429
333880 Dunham, I. et al. Plus 8151923-8152133
333885 Dunham, I. et al. Plus 8154352-8154437
333918 Dunham, I. et al. Plus 8307124-8307215
333947 Dunham, I. et al. Plus 8579888-8579966
333961 Dunham, I. et al. Plus 8617999-8618104
333981 Dunham, I. et al. Plus 8782374-8782643
333991 Dunham, I. et al. Plus 8837419-8837551
333994 Dunham, I. et al. Plus 8852749-8852894
334030 Dunham, I. et al. Plus 9288463-9288782
334083 Dunham, I. et al. Plus 9837016-9837081
334111 Dunham, I. et al. Plus 10279365-10279531
334135 Dunham, I. et al. Plus 10457085-10457183
334218 Dunham, I. et al. Plus 12680289-12680378
334249 Dunham, I. et al. Plus 13190430-13190574
334262 Dunham, I. et al. Plus 13231452-13231581
334264 Dunham, I. et al. Plus 13234447-13234544
334327 Dunham, I. et al. Plus 13577413-13577496
334328 Dunham, I. et al. Plus 13589868-13589936
334340 Dunham, I. et al. Plus 13642407-13642522
334454 Dunham, I. et al. Plus 14326506-14326738
334504 Dunham, I. et al. Plus 14510206-14510398
334508 Dunham, I. et al. Plus 14514936-14515122
334512 Dunham, I. et al. Plus 14545933-14546366
334582 Dunham, I. et al. Plus 15026255-15026371
334659 Dunham, I. et al. Plus 15460624-15460726
334721 Dunham, I. et al. Plus 15796816-15796987
334723 Dunham, I. et al. Plus 15805317-15805399
334730 Dunham, I. et al. Plus 15967830-15967934
334774 Dunham, I. et al. Plus 16251857-16252178
334778 Dunham, I. et al. Plus 16276180-16276395
334851 Dunham, I. et al. Plus 17820110-17820810
334885 Dunham, I. et al. Plus 19233667-19233787
334902 Dunham, I. et al. Plus 19317083-19317195
334905 Dunham, I. et al. Plus 19322553-19322680
334906 Dunham, I. et al. Plus 19323493-19323590
334910 Dunham, I. et al. Plus 19398155-19393684
335018 Dunham, I. et al. Plus 20688288-20683415
335025 Dunham, I. et al. Plus 20743941-20744050
335033 Dunham, I. et al. Plus 20753188-20753314
335044 Dunham, I. et al. Plus 2084208*20842682
335142 Dunham, I. et al. Plus 21465105-21465186
335157 Dunham, I. et al. Plus 21543302-21544341
335160 Dunham, I. et al. Plus 2157338*21573497
335174 Dunham, I. et al. Plus 21631301-21631447
335188 Dunham, I. et al. Plus 21669118-21669328
335190 Dunham, I. et al. Plus 21680807-21680876
335191 Dunham, I. et al. Plus 21681110-21681183
335193 Dunham, I. et al. Plus 2169220*21692362
335204 Dunham, I. et al. Plus 2175063*21750726
335222 Dunham, I. et al. Plus 21885542-21885608
335226 Dunham, I. et al. Plus 2189083*21890930
335227 Dunham, I. et al. Plus 21892145-21892289
335309 Dunham, I. et al. Plus 2250015*22500276
335310 Dunham, I. et al. Plus 22500714-22500831
335311 Dunham, I. et al. Plus 22501602-22501678
335355 Dunham, I. et al. Plus 22779222-22779516
335362 Dunham, I. et al. Plus 22809167-22809461
3353SS Dunham, I. et al. Plus 22843040-22843184
335384 Dunham, I. et al. Plus 22918150-22918263
335385 Dunham, I. et al. Plus 22918072-22918339
335438 Dunham, I. et al. Plus 23427793-23427923
335440 Dunham, I. et al. Plus 23458702-23459017
335441 Dunham, I. et al. Plus 23460632-23460724
335450 Dunham, I. et al. Plus 23480190-23480270
335453 Dunham, I. et al. Plus 23483333-23483459
335458 Dunham, I. et al. Plus 23490034-23490143
335484 Dunham, I. et al. Plus 23500331-23500496
335496 Dunham, I. et al. Plus 24164386-24164545
335497 Dunham, I. et al. Plus 24167666-24167869
335498 Dunham, I. et al. Plus 24172082-24172181
335499 Dunham, I. et al. Plus 24176698-24176869
335500 Dunham, I. et al. Plus 24178236-24178326
335507 Dunham, I. et al. Plus 24219973-24220039
335510 Dunham, I. et al. Plus 24222975-24223118
335513 Dunham, I. et al. Plus 24224272-24224498
335627 Dunham, I. et al. Plus 25150005-25150061
335651 Dunham, I. et al. Plus 25317560-25317696
335655 Dunham, I. et al. Plus 25333211-25333369
335656 Dunham, I. et al. Plus 25333601-25333751
335658 Dunham, I. et al. Plus 25336315-25336406
335663 Dunham, I. et al. Plus 25342680-25342802
3356S5 Dunham, I. et al. Plus 25344098-25344287
335667 Dunham, I. et al. Plus 25345735-25345856
335668 Dunham, I. et al. Plus 25346313-25346447
335689 Dunham, I. et al. Plus 25454350-25454604
335690 Dunham, I. et al. Plus 25455442-25455625
335715 Dunham, I. et al. Plus 25565941-25566052
335719 Dunham, I. et al. Plus 25593936-25594101
335734 Dunham, I. et al. Plus 25688723-25688869
335744 Dunham, I. et al. Plus 25716483-25716615
335809 Dunham, I. et al. Plus 28310772-26310909
335819 Dunham, I. et al. Plus 26356341-26356470
335822 Dunham, I. et al. Plus 26364087-26364196
335872 Dunham, I. et al. Plus 26820760-26820943
335885 Dunham, I. et al. Plus 28933436-26933534
335988 Dunham, I. et al. Plus 27743843-27744029
335971 Dunham, I. et al. Plus 27752803-27753017
335975 Dunham, I. et al. Plus 27801321-27801391
335976 Dunham, I. et al. Plus 27809041-27809187
335989 Dunham, I. et al. Plus 27983783-27983860
335990 Dunham, I. et al. Plus 27988532-27988608
336010 Dunham, I. et al. Plus 28570239-28570330
336093 Dunham, I. et al. Plus 29556922-29557002
336126 Dunham, I. et al. Plus 30057891-30058105
336129 Dunham, I. et al. Plus 30062259-30062348
336187 Dunham, I. et al. Plus 30433494-30433585
336188 Dunham, I. et al. Plus 30434870-30435004
336225 Dunham, I. et al. Plus 30833614-30833788
336371 Dunham, I. et al. Plus 33968108-33968204
336373 Dunham, I. et al. Plus 33976308-33976504
336377 Dunham, I. et al. Plus 33994489-33994599
336380 Dunham, I. et al. Plus 33995323-33995434
336333 Dunham, I. et al. Plus 34005784-34005964
336384 Dunham, I. et al. Plus 34007429-34007559
338385 Dunham, I. et al. Plus 34007879-34008159
336386 Dunham, I. et al. Plus 34012965-34013115
336441 Dunham, I. et al. Plus 34187606-34187663
336444 Dunham, I. et al. Plus 34190585-34190718
336484 Dunham, I. et al. Plus 34237425-34237505
336497 Dunham, I. et al. Plus 34267190-34267245
336499 Dunham, I. et al. Plus 34267504-34267572
336503 Dunham, I. et al. Plus 34271306-34271372
336548 Dunham, I. et al. Plus 34353881-34354826
336857 
336911 



336950 



336552 
336553 
3365S7 
336568 
336659
336715
336803
336993
337076 
337109 
337123 
337151 
337189 
337241 
337337 
337353 
337384 
337396 
337414 
337418 
337461 
337480 
337482 
337483 
337490 
337522 
337532 
337552 
337584 
337611 
337672 
337693 
337738 
337926 
337927 
337935 
337944 
337954 
337996 
338004 
338016 
338174 
338176 
338238 
338277 
338294 
338316 
338323 
338324 
338386 



338410 
338414 
338460 
338481 
338489 
338500 
338514 
338530 
338620 
338631 
338653 



Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, total. Plus 
Dunham, I. eUL Phis 
Dunham, LeUI. Phis 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Phis 
Dunham, LetaL Plus 
Dunham, L etaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Phis 
Dunham, LetaL Plus 
Dunham, LetaL Pais 
Dunham, LetaL Phis 
Dunham, LetaL Phis 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, L etaL Phis 
Dunham, LetaL Plus 
Dunham, LetaL Phis 
Dunham, LetaL Phis 
Dunham, LetaL Phis 
Dunham, L etaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Phis 
Dunham, t etaL Phis 
Dunham, L etaL Phis 
Dunham, LetaL Phis 
Dunham, LataL Phis 
Dunham, LetaL Phis 
Dunham, tetat Phis 
Dunham, LetaL Phis 
Dunham, L etaL Plus 
Dunham, LetaL Phis 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Phis 
Dunham, I. etaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, L etaL 
Dunham, L etaL 
Dunham. LetaL Plus 
Dunham, LetaL Plus 
Dunham, I. eLaL Plus 
Dunham, L etaL Phis 
Dunham, I. etaL Plus 
Dunham, I. etat Plus 
Dunham, LetaL Plus 
Dunham, I. etaL Phis 
Dunham, I. etaL Plus 
Dunham, I. eLal. Plus 
Dunham, I. eLaL Plus 
Dunham, L eLal. Phis 
Dunham, I. eLal. Plus 
Dunham, LetaL Phis 
Dunham, L eLaL Phis 
Dunham, L eLaL Plus 
Dunham, LeUL Phis 
Dunham, I. eLaL Plus 
Dunham, L eLaL Plus 



Plus 
Plus 
Plus 
Plus 
Plus 



34356420-34356527 

34356683-34356753 

34428228-34428395 

34428521-34428637 

1896402-1896478 

31101984110314 

6106904-6106990 

6126661-6126786 

7745284-7745355 

8130457-8130612 

11035818-11035984 

12818687-12818891 

12875843-12875912 

13203550-13203973 

15096270-15096324 

19338177-19338679 

21166580-21166650 

22052874-22052942 

2310643343106510 

24225887-24225954 

27280182-27280313 

3039518240395285 

30804624-30804780 

31333399-31333580 

31585902-31586067 

31953012-31953205 

3201404942014131 

3280396842804028 

3321971443219779 

3322786543227946 

3323729243237427 

3331857143318644 

3396318843963979 

3418725944187366 

19497-19600 

945236445452 

1482883-1483016 

33312364331313 

35759754576153 

38657334865814 

6266377-6286470 

63430334343172 

65346614534782 

65893834589450 

68314834831620 

7445532-7445633 

7601363-7601520 

7863131-7863310 

12771102-12771268 

12774072-12774223 

14661936-14662015 

16167622-16167962 

16463958-16464539 

17089711-17089988 

17154655-17154792 

17155309-17155574 

18811213-18611407 

18953492-18953581 

19292807-19292916 

19345573-19345660 

20233372-20233488 

20942659-20942873 

21142605-21143049 

21253847-21253974 

21379420-21379655 

21636361-21636509 

23540239-23540334 

23711167-23711241 

24219427-24219509 
338660 Dunham, I. et al. Plus 24337122-24387266
338704 Dunham, I. et al. Plus 25230432-25230548
338847 Dunham, I. et al. Plus 27995337-27895420
338887 Dunham, I. et al. Plus 28465244-28465384
338895 Dunham, I. et al. Plus 28598893-28599135
338915 Dunham, I. et al. Plus 28824881-28824977
338925 Dunham, I. et al. Plus 28883892-28884036
338936 Dunham, I. et al. Plus 29148022-29148160
338952 Dunham, I. et al. Plus 29418831-29418988
338980 Dunham, I. et al. Plus 28896789-29898874
338981 Dunham, I. et al. Plus 29897917-29898008
338988 Dunham, I. et al. Plus 30007287-30007415
339009 Dunham, I. et al. Plus 30348477-30348598
339017 Dunham, I. et al. Plus 30420898-30421090
339045 Dunham, I. et al. Plus 30744288-30744358
339046 Dunham, I. et al. Plus 30746269-30746420
339059 Dunham, I. et al. Plus 30814655-30814801
339067 Dunham, I. et al. Plus 30869347-30869412
339069 Dunham, I. et al. Plus 30880975-30881070
339078 Dunham, I. et al. Plus 30914311-30914423
339084 Dunham, I. et al. Plus 30944556-30944803
339101 Dunham, I. et al. Plus 31158047-31158123
339102 Dunham, I. et al. Plus 31169321-31169583
339103 Dunham, I. et al. Plus 3117034*31170454
339115 Dunham, I. et al. Plus 31459869-31459927
339157 Dunham, I. et al. Plus 32131701-32131833
339166 Dunham, I. et al. Plus 32210902-32211008
339167 Dunham, I. et al. Plus 32213567-32213730
339288 Dunham, I. et al. Plus 33169611-33169691
339289 Dunham, I. et al. Plus 33186756-33186903
339291 Dunham, I. et al. Plus 33205057-33205247
339407 Dunham, I. et al. Plus 34189461-34189620
332865 Dunham, I. et al. Minus 1391482-1391218
332881 Dunham, I. et al. Minus 1563520-1563184
332930 Dunham, I. et al. Minus 2022565-2022497
332931 Dunham, I. et al. Minus 2023651-2023562
332984 Dunham, I. et al. Minus 2632606-2632457
332986 Dunham, I. et al. Minus 2635398-2635206
332997 Dunham, I. et al. Minus 2710509-2710375
333051 Dunham, I. et al. Minus 2991973-2991840
333081 Dunham, I. et al. Minus 3029631-3029527
333064 Dunham, I. et al. Minus 3030722-3030623
333096 Dunham, I. et al. Minus 3184234-3184118
333099 Dunham, I. et al. Minus 3206786-3206674
333106 Dunham, I. et al. Minus 3230744-3230547
333160 Dunham, I. et al. Minus 365489*3654678
333163 Dunham, I. et al. Minus 3665124-3564962
333165 Dunham, I. et al. Minus 3674052-3673905
333166 Dunham, I. et al. Minus 3694664-3694587
333170 Dunham, I. et al. Minus 3733394-3733299
333174 Dunham, I. et al. Minus 3764284-3764210
333188 Dunham, I. et al. Minus 382699*3826863
333214 Dunham, I. et al. Minus 396655*3966437
333232 Dunham, I. et al. Minus 4001551-4001365
333237 Dunham, I. et al. Minus 4003326-4003219
333239 Dunham, I. et al. Minus 4095881-4094462
333255 Dunham, I. et al. Minus 429788*4297716
333259 Dunham, I. et al. Minus 4308769-4306639
333274 Dunham, I. et al. Minus 438914*4388954
333290 Dunham, I. et al. Minus 4530734-4530554
333295 Dunham, I. et al. Minus 4549290-4549198
333296 Dunham, I. et al. Minus 4550766-4550644
333310 Dunham, I. et al. Minus 4637315-4637232
333311 Dunham, I. et al. Minus 463793*4637844
333312 Dunham, I. et al. Minus 4638794-4638635
333313 Dunham, I. et al. Minus 4639397-4839277
333315 Dunham, I. et al. Minus 5405980-5405876
333318 Dunham, I. et al. Minus 464263*4642564
333321 Dunham, I. et al. Minus 4649080-4648934
333327 Dunham, I. et al. Minus 4657947-4657828
333335 Dunham, I. et al. Minus 4672656-4672564
333337 Dunham, I. et al. Minus 4677930-4677841
333454 Dunham, I. et al. Minus 5137007-5136880
333458 Dunham, I. et al. Minus 5143942-5143806
333459 Dunham, I. et al. Minus 5144548-5144344
333470 Dunham, I. et al. Minus 5223319*223088
333433 Dunham, I. et al. Minus 4637315-4637232
333438 Dunham, I. et al. Minus 5404643*404523
333498 Dunham, I. et al. Minus 5405980-5405876
333510 Dunham, I. et al. Minus 5557626-5557469
333546 Dunham, I. et al. Minus 5886643*886442
333561 Dunham, I. et al. Minus 5903659*903590
333738 Dunham, I. et al. Minus 7552160-7552084
333780 Dunham, I. et al. Minus 7750387-7750277
333783 Dunham, I. et al. Minus 7751850-7751777
333818 Dunham, I. et al. Minus 7911959-7911762
333894 Dunham, I. et al. Minus 8188855*188709
333897 Dunham, I. et al. Minus 8194390*194284
333900 Dunham, I. et al. Minus 8200268*200122
333909 Dunham, I. et al. Minus 8229639*229477
333936 Dunham, I. et al. Minus 8512805*512564
333944 Dunham, I. et al. Minus 8557051*556936
334040 Dunham, I. et al. Minus 9342995*342934
334154 Dunham, I. et al. Minus 10570714-10570572
334178 Dunham, I. et al. Minus 11755052-11754971
334188 Dunham, I. et al. Minus 11925963-11925834
334273 Dunham, I. et al. Minus 13265608-13265522
334282 Dunham, I. et al. Minus 13285293-13285178
334285 Dunham, I. et al. Minus 13289990-13289793
334286 Dunham, I. et al. Minus 13291759-13291569
334303 Dunham, I. et al. Minus 13454331-13454217
334305 Dunham, I. et al. Minus 13456310-13456209
334306 Dunham, I. et al. Minus 13461157-13461049
334320 Dunham, I. et al. Minus 13496857-13496717
334352 Dunham, I. et al. Minus 13675908-13675828
334353 Dunham, I. et al. Minus 13683722-13683596
334359 Dunham, I. et al. Minus 13728664-13728534
334363 Dunham, I. et al. Minus 13740004-13739812
334365 Dunham, I. et al. Minus 13742078-13741971
334399 Dunham, I. et al. Minus 14186289-14186163
334409 Dunham, I. et al. Minus 14195181-14195075
334414 Dunham, I. et al. Minus 14234033-14233932
334470 Dunham, I. et al. Minus 14389581-14389442
334483 Dunham, I. et al. Minus 14428355-14428281
334489 Dunham, I. et al. Minus 14455428-14454288
334498 Dunham, I. et al. Minus 14483789-14483700
334501 Dunham, I. et al. Minus 14487509-14487356
334502 Dunham, I. et al. Minus 14488605-14488526
334543 Dunham, I. et al. Minus 14834496-14834116
334622 Dunham, I. et al. Minus 15191678-15191609
334650 Dunham, I. et al. Minus 15371251-15371178
334680 Dunham, I. et al. Minus 15520047-15519887
334745 Dunham, I. et al. Minus 16049960-16049653
334756 Dunham, I. et al. Minus 16128678-16128528
334758 Dunham, I. et al. Minus 16132368-16132233
334761 Dunham, I. et al. Minus 16138424-16138319
334763 Dunham, I. et al. Minus 16148136-16148077
334784 Dunham, I. et al. Minus 16294548-16294360
334790 Dunham, I. et al. Minus 16307576-16307509
334793 Dunham, I. et al. Minus 16330748-16330681
334802 Dunham, I. et al. Minus 16413158-16413026
334820 Dunham, I. et al. Minus 16764338-16764249
334824 Dunham, I. et al. Minus 16857777-16857674
334832 Dunham, I. et al. Minus 17173957-17173760
334842 Dunham, I. et al. Minus 17464352-17464181
334844 Dunham, I. et al. Minus 17503891-17503766
334857 Dunham, I. et al. Minus 18488368-18488242
334927 Dunham, I. et al. Minus 19988711-19987853
334339 Dunham, I. et al. Minus 20131162-20131054
334951 Dunham, I. et al. Minus 20147708-20147502
334969 Dunham, I. et al. Minus 20188176-20188020
334972 Dunham, I. et al. Minus 20234734-20294811
335050 Dunham, I. et al. Minus 20884109-20883951
335078 Dunham, I. et al. Minus 21059529-21059458
335102 Dunham, I. et al. Minus 21313841-21313598
335105 Dunham, I. et al. Minus 21320563-21320440
335110 Dunham, I. et al. Minus 21334136-21333811
335111 Dunham, I. et al. Minus 21335946-21335809
335115 Dunham, I. et al. Minus 21383250-21388146
335116 Dunham, I. et al. Minus 21388573-21388414
335185 Dunham, I. et al. Minus 21651593-21651522
335186 Dunham, I. et al. Minus 21656436-21656338
335230 Dunham, I. et al. Minus 21899517-21898678
335236 Dunham, I. et al. Minus 21915016-21914870
335243 Dunham, I. et al. Minus 21933519-21933365
335249 Dunham, I. et al. Minus 21950851-21950869
335258 Dunham, I. et al. Minus 22043431-22043262
335261 Dunham, I. et al. Minus 22083937-22063772
335276 Dunham, I. et al. Minus 22154038-22153937
335279 Dunham, I. et al. Minus 22168834-22168638
335330 Dunham, I. et al. Minus 22556589-22556422
335331 Dunham, I. et al. Minus 22556823-22558708
335334 Dunham, I. et al. Minus 22560390-22560136
335346 Dunham, I. et al. Minus 22641097-22640918
335349 Dunham, I. et al. Minus 22661861-22661271
335611 Dunham, I. et al. Minus 25070825-25070706
335612 Dunham, I. et al. Minus 25072328-25072142
335671 Dunham, I. et al. Minus 25358629-25358533
335676 Dunham, I. et al. Minus 25395274-25395152
335680 Dunham, I. et al. Minus 25402437-25402361
335750 Dunham, I. et al. Minus 25732501-25731972
335752 Dunham, I. et al. Minus 25757026-25756890
335755 Dunham, I. et al. Minus 25763806-25763747
335767 Dunham, I. et al. Minus 25819547-25819218
335774 Dunham, I. et al. Minus 25883733-25883572
335777 Dunham, I. et al. Minus 25885770-25885599
335778 Dunham, I. et al. Minus 25886469-25886334
335797 Dunham, I. et al. Minus 25958182-25958030
335800 Dunham, I. et al. Minus 25985373-25985280
335818 Dunham, I. et al. Minus 26323886-26323744
335834 Dunham, I. et al. Minus 26391707-26391530
335840 Dunham, I. et al. Minus 26420596-26420538
335844 Dunham, I. et al. Minus 26433427-28433344
335846 Dunham, I. et al. Minus 26438727-26436621
335856 Dunham, I. et al. Minus 26662452-26662346
335887 Dunham, I. et al. Minus 28939225-26938782
335888 Dunham, I. et al. Minus 26943037-26942820
335889 Dunham, I. et al. Minus 26946988-26948901
335890 Dunham, I. et al. Minus 26949087-26948665
335893 Dunham, I. et al. Minus 26973898-26973747
335895 Dunham, I. et al. Minus 26975307-26975239
335896 Dunham, I. et al. Minus 26977639-26977558
335900 Dunham, I. et al. Minus 26980354-26980238
335907 Dunham, I. et al. Minus 27013352-27013273
335943 Dunham, I. et al. Minus 27446610-27448378
335956 Dunham, I. et al. Minus 27653729-27653635
335959 Dunham, I. et al. Minus 27682313-27682145
335962 Dunham, I. et al. Minus 27704276-27704144
336040 Dunham, I. et al. Minus 29036458-29036300
336044 Dunham, I. et al. Minus 29043828-29043727
336047 Dunham, I. et al. Minus 29050617-29050466
336063 Dunham, I. et al. Minus 29252077-29251969
336143 Dunham, I. et al. Minus 30135948-30135854
336153 Dunham, I. et al. Minus 30163730-30163610
336174 Dunham, I. et al. Minus 30241988-30241839
336223 Dunham, I. et al. Minus 30316306-30816195
336245 Dunham, I. et al. Minus 31420569-31420509
336274 Dunham, I. et al. Minus 32085458-32035303
336318 Dunham, I. et al. Minus 33364452-33334338
336326 Dunham, I. et al. Minus 33567328-33557201
336339 Dunham, I. et al. Minus 33783479-33798330
336340 Dunham, I. et al. Minus 33812069-33811915
336355 Dunham, I. et al. Minus 33874750-33874649
336392 Dunham, I. et al. Minus 34015868-34015738
336393 Dunham, I. et al. Minus 34016145-34015951
336394 Dunham, I. et al. Minus 34016457-34016298
336400 Dunham, I. et al. Minus 34023437-34023298
336402 Dunham, I. et al. Minus 34024090-34023981
336413 Dunham, I. et al. Minus 34046702-34046576
336424 Dunham, I. et al. Minus 34055549-34055491
336425 Dunham, I. et al. Minus 34058544-34058446
338437 Dunham, I. et al. Minus 34074154-34074090
336447 Dunham, I. et al. Minus 34198207-34197998
336449 Dunham, I. et al. Minus 34204707-34204577
33S466 Dunham, I. et al. Minus 34213195-34213046
336492 Dunham, I. et al. Minus 34255578-34255437
336511 Dunham, I. et al. Minus 34277480-34277351
336512 Dunham, I. et al. Minus 34278373-34278275
336520 Dunham, I. et al. Minus 34319184-34319101
336522 Dunham, I. et al. Minus 34320169-34320056
336524 Dunham, I. et al. Minus 34321055-34320921
336527 Dunham, I. et al. Minus 34322071-34321966
336534 Dunham, I. et al. Minus 34326797-34326620
336536 Dunham, I. et al. Minus 34327678-34327538
336542 Dunham, I. et al. Minus 34331316-34331183
336556 Dunham, I. et al. Minus 34375244-34374907
336557 Dunham, I. et al. Minus 34375443-34375341
336558 Dunham, I. et al. Minus 34375825-34375698
336559 Dunham, I. et al. Minus 34376430-34376261
336560 Dunham, I. et al. Minus 34376814-34376596
336561 Dunham, I. et al. Minus 34377168-34376928
336597 Dunham, I. et al. Minus 7627912-7627757
336601 Dunham, I. et al. Minus 13265853-13265654
336642 Dunham, I. et al. Minus 1304281-1304212
336645 Dunham, I. et al. Minus 1351268-1351168
336662 Dunham, I. et al. Minus 2158060-2157993
336664 Dunham, I. et al. Minus 1993558-1893481
336676 Dunham, I. et al. Minus 2022555-2022497
336684 Dunham, I. et al. Minus 2158060-2157993
336688 Dunham, I. et al. Minus 2160698-2160486
336714 Dunham, I. et al. Minus 3094026-3093871
336719 Dunham, I. et al. Minus 3331631-3331503
336736 Dunham, I. et al. Minus 4093128-4093041
336744 Dunham, I. et al. Minus 4333001-4332848
338786 Dunham, I. et al. Minus 54199736419873
336793 Dunham, I. et al. Minus 5631345-5631237
336859 Dunham, I. et al. Minus 82017586201581
336863 Dunham, I. et al. Minus 83966734396425
336933 Dunham, I. et al. Minus 11760045-11759981
336942 Dunham, I. et al. Minus 12027537-12027455
336960 Dunham, I. et al. Minus 13267243-13267172
336969 Dunham, I. et al. Minus 13725722-13725643
336971 Dunham, I. et al. Minus 13732308-13732221
337003 Dunham, I. et al. Minus 15523541-15523422
337011 Dunham, I. et al. Minus 16106423-16106080
337070 Dunham, I. et al. Minus 19034423-19034321
337072 Dunham, I. et al. Minus 19077452-19077323
337086 Dunham, I. et al. Minus 19657011-19656881
337140 Dunham, I. et al. Minus 22649450-22649388
337193 Dunham, I. et al. Minus 24594969-24594874
337256 Dunham, I. et al. Minus 27659956-27659876
337278 Dunham, I. et al. Minus 28429017-28428848
337284 Dunham, I. et al. Minus 28491414-28491094
337293 Dunham, I. et al. Minus 28846334-28845873
337316 Dunham, I. et al. Minus 29657129-29656997
337326 Dunham, I. et al. Minus 30017199-30017069
337382 Dunham, I. et al. Minus 31233868-31233579
337392 Dunham, I. et al. Minus 31442311-31442229
337406 Dunham, I. et al. Minus 31864840-31884588
337412 Dunham, I. et al. Minus 31916487-31916312
337419 Dunham, I. et al. Minus 32021496-32021170
337436 Dunham, I. et al. Minus 32257869-32257739
337455 Dunham, I. et al. Minus 32434517-32434425
337509 Dunham, I. et al. Minus 33414613-33414498
337518 Dunham, I. et al. Minus 33798750-33796647
337529 Dunham, I. et al. Minus 34043668-34043548
337533 Dunham, I. et al. Minus 34193388-34193261
337539 Dunham, I. et al. Minus 34254490-34254322
337551 Dunham, I. et al. Minus 34524446-34524362
337553 Dunham, I. et al. Minus 24230-24160
337591 Dunham, I. et al. Minus 1006414-1006184
337592 Dunham, I. et al. Minus 1007791-1007634
337593 Dunham, I. et al. Minus 1009460-1009291
337607 Dunham, I. et al. Minus 1355719-1355637
337612 Dunham, I. et al. Minus 1570235-1570142
337635 Dunham, I. et al. Minus 21696904169569
337824 Dunham, I. et al. Minus 4559540-4559266
337825 Dunham, I. et al. Minus 4567155-4567005
337850 Dunham, I. et al. Minus 5077143-5076943
337854 Dunham, I. et al. Minus 5153435-5153272
337913 Dunham, I. et al. Minus 61498434149786
337915 Dunham, I. et al. Minus 59227484922690
337968 Dunham, I. et al. Minus 7095797-7095680
338010 Dunham, I. et al. Minus 7754282-7754184
338012 Dunham, I. et al. Minus 7761421-7761351
338017 Dunham, I. et al. Minus 7884521-7864401
338065 Dunham, I. et al. Minus 7235048-7234950
338094 Dunham, I. et al. Minus 9595602-9595440
338129 Dunham, I. et al. Minus 10915338-10915237
338132 Dunham, I. et al. Minus 10989617-10989530
338150 Dunham, I. et al. Minus 11478551-11478355
338157 Dunham, I. et al. Minus 11731444-11731375
338195 Dunham, I. et al. Minus 13484103-13483972
338255 Dunham, I. et al. Minus 15242294-15242231
338276 Dunham, I. et al. Minus 16109555-16109398
338431 Dunham, I. et al. Minus 19747608-19747498
338448 Dunham, I. et al. Minus 20151152-20151054
338451 Dunham, I. et al. Minus 20174286-20174193
338477 Dunham, I. et al. Minus 20821897-20821838
338534 Dunham, I. et al. Minus 21771238-21771170
338682 Dunham, I. et al. Minus 24800712-24800461
338684 Dunham, I. et al. Minus 24827522-24827428
338689 Dunham, I. et al. Minus 24893073-24892972
338695 Dunham, I. et al. Minus 25104153-25104016
338825 Dunham, I. et al. Minus 27664798-27664712
338842 Dunham, I. et al. Minus 27824238-27824079
338893 Dunham, I. et al. Minus 28491807-28491631
338904 Dunham, I. et al. Minus 28766345-28766253
338935 Dunham, I. et al. Minus 29071537-29071461
339022 Dunham, I. et al. Minus 30523414-30523289
339034 Dunham, I. et al. Minus 3062160340621422
339190 Dunham, I. et al. Minus 3240310342402985
339212 Dunham, I. et al. Minus 32494335-32494210
339213 Dunham, I. et al. Minus 3249659042498440
339216 Dunham, I. et al. Minus 32504250-32504109
339233 Dunham, I. et al. Minus 32751331-32751238
339258 Dunham, I. et al. Minus 32934756-32934615
339262 Dunham, I. et al. Minus 3297125842971080
339263 Dunham, I. et al. Minus 32974634-32974452
339265 Dunham, I. et al. Minus 3297594342975808
339338 Dunham, I. et al. Minus 33468728-33468606
339396 Dunham, I. et al. Minus 3401730644017205
339400 Dunham, I. et al. Minus 3404502444044940
339425 Dunham, I. et al. Minus 3440791144407798
325207 6552430 Plus 140049-140170
329568 3962490 Plus 38331-38750
329517 3983513 Minus 53197-53269
325313 5866865 Minus 27335-28192
325327 5866875 Plus 75189-75264
325317 5866878 Minus 156551-156649
325257 5866895 Plus 10887-10955
329632 6729060 Plus 192313-193017
325371 5866920 Minus 1035422-1035536
325375 5866920 Minus 1165503-1165810
325378 5866920 Minus 1187981-1188167
325469 6017034 Plus 286823-286991
325470 6017034 Plus 287578-287863
325576 6552443 Minus 137769-137894
325505 6682451 Minus 240852-240946
325543 6682452 Plus 151873-152057
329635 5302817 Minus 62522-62822
329636 5302817 Minus 64969-65078
325593 5866992 Minus 469726-469860
325675 5867014 Plus 955517-955711
325704 5867028 Plus 156198-156387
325682 6138923 Plus 370518-370763
325785 6381957 Plus 61849-62003
325666 6469822 Plus 16769-16857
325818 6682490 Minus 120278-120559
329777 6002090 Minus 191389-191479
329768 6015501 Plus 118315-118422
329759 6048280 Minus 37647-37730
329731 6065783 Plus 158772-158900
329687 6117858 Minus 22165-22288
329676 6272128 Minus 142207-142359
329667 6272129 Plus 101355-101745
329669 6272129 Plus 131223-131291
329670 6272129 Plus 131351-131495
329641 6468233 Minus 105995-106107
329791 6469354 Minus 131982-132089
325826 5867048 Minus 46381-46458
325829 5867052 Plus 232674-233060
329888 6067149 Minus 37227-37473
329893 6525313 Minus 166123-166791
329899 6563505 Minus 111058-111783
325988 5867064 Plus 17349-17606
325855 5867067 Plus 276141-276251
325999 5867073 Plus 149115-149192
326001 5867073 Plus 155223-155348
325886 5867087 Plus 194694-194915
325882 5867087 Minus 81784347
325905 5867104 Plus 78779-78876
325922 5867122 Minus 329063-329134
325937 5867132 Minus 152633-152902
325960 5867147 Minus 162506-162635
325961 5867147 Minus 165106-165209
325838 6552452 Plus 171451-171532
325839 6552452 Plus 181984-182037
325840 6552452 Plus 184380-184547
325844 6552453 Minus 14188-14332
325870 6682492 Plus 228209-228297
329984 4646193 Minus 139780-139890
329976 4878063 Minus 6258442691
329935 6165200 Minus 6905949127
329916 6223624 Plus 3639647195
330021 6671889 Plus 120938-121032
330024 6671908 Minus 1005-1270
330028 6671908 Minus 3001540144
326033 5867178 Plus 3726147333
326036 5867178 Minus 120215-120273
326056 5867184 Minus 181553-181690
326116 5867193 Plus 45548-45604
326122 5867194 Plus 144397-144683
326138 5867203 Minus 179374-179436
328145 5887204 Minus 5259932814
326180 6887211 Minus 182758-183222
326201 6867216 Minus 166168-166959
326207 S867222 Plus 48139-48219
326228 5867230 Plus 5264432705
326233 6867232 Plus 124786-124863
326238 5887260 Plus 6428234338
326241 6887260 Minus 181648-181916
326243 5887261 Plus 123838-123978
326251 5667263 Minus 8271632822
326268 5867267 Plus 122114-122765
326124 5916395 Plus 407102-407560
326339 6056311 Minus 164637-165251
330049 4567182 Minus 314662-315210
326358 5867293 Plus 91223195
326365 5867297 Minus 9663036764
326379 5867327 Plus 32299-32402
326382 5867327 Minus 5042030503
326390 5867340 Minus 108814-110592
326424 5867369 Minus 168329-16B409
326453 5867399 Plus 8622236423
326472 5867404 Plus 293739-293940
326492 5887422 Plus 120768-120991
326533 5867441 Minus £32153332280
330117 6015201 Minus 7340-7680
330115 6015202 Plus 11403-11677
330116 6015202 Plus 12109-12418
330095 6015278 Plus 15343-15814
330096 6015278 Plus 49370-49458
326644 5867559 Plus 42684-42819
326713 5867595 Plus 121511-121798
326745 5867611 Plus 127130-127318
326752 5887615 Minus 1214-1562
326753 5867616 Plus 12454-12511
326598 5867634 Plus 6895539014
326667 6552455 Plus 142311-142441
326855 6552460 Minus 111390-111463
326812 6682504 Plus 189811-189941
327005 5667664 Plus 610847310907
327008 5867664 Plus 928737328811
326896 5867680 Minus 12032-12122
326904 5867684 Minus 92804606
326951 6004446 Plus 193812-193998
326941 6004446 Plus 6201832896
326943 6004446 Minus 8924239427
326928 6456782 Minus 291007-291219
326958 6469836 Minus 42952-43082
326959 6469836 Minus 43159-43301
327039 6531965 Plus 694486394998
327127 6682520 Plus 41925-42083
330158 6580367 Plus 8196632456
327204 5867447 Plus 165135-165239
327208 5867447 Plus 180805-180864
327268 5867462 Minus 8240032615
327277 5867473 Minus 165616-165715
327289 5867481 Plus 49296-49536
327296 5867492 Plus 76273166
327237 5867544 Minus 5970239813
327145 5887548 Minus 40482-40551
327333 5902477 Minus 141448-141609
327335 5902477 Minus 142979-143124
327343 6017017 Minus 12288-12395
327350 6249563 Minus 41890-41985
327358 6552411 Minus 3802-3950
327360 6552411 Minus 62553422
327409 5867750 Minus 52949-53011
327424 5867751 Plus 160442-160598
327430 5687754 Plus 1320-1403
327470 5867772 Plus 150910-150973
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327460 6004455 
327438 6017023 

327509 6117815 

327510 6117815 
5 327512 6117815 

327535 6525279 
330163 6042042 
330171 6648220 
327579 5867824 

10 327672 5867843 
327629 5867872 
327640 5867890 
327649 5867899 
327612 6525283 

IS 327716 6525284 
327801 5867924 

327762 5867961 

327763 5867981 
327776 6867964 

20 327822 5867968 
327823 5867963 
327807 5867963 
327845 6531962 
330228 6013527 

25 330190 6165182 
328122 5868031 
328132 5868038 
328159 5868065 
328168 5868071 

30 328175 5866073 
328217 586809S 

327865 5868130 

327866 5868131 
327870 5868131 

35 327879 6868142 
327902 5868158 
327918 5868165 
327934 5868184 
327959 5868210 

40 327976 5868212 
326020 5902482 
328042 5902482 
328008 5902482 
330301 2905862 

45 330299 2905881 
328274 5668219 
328595 6868224 
328591 5668227 
328668 5868254 

50 328677 5868256 
328687 5868262 
328706 5868270 
328711 5868271 
328730 5868289 

55 328732 5868289 
328734 5868289 
328752 5868298 
328755 5868301 
328761 5868302 

60 328775 5668309 
328784 5868309 
328787 5868309 
328809 5868327 
328829 5868337 

65 328280 5868352 
328311 5868371 
328318 5868373 
328323 5868373 
328348 58683B3 



Plus 


175245-175343 


Minus 


42178-42283 


Minus 


54882-55053 


Minus 


56824-56944 


Plus 


176256-176325 


Plus 


19105-19175 


Minus 


20321-20385 


Plus 


110889-111575 


Minus 


37229-38335 


Minus 


6964949740 


Plus 


49692-49811 


Plus 


84484566 


Plus 


205871-205927 


Plus 


27474924 


Plus 


8612346183 


Plus 


2323943348 


Minus 


5030340439 


Plus 


229347-229476 


Minus 


164308-164486 


^ Minus 


168886-169633 


Minus 


170359-170433 


Plus 


3374543811 


Plus 


193402-193549 


Minus 


3719-3787 


Plus 


3610346243 


Plus 


158474-158656 


Minus 


126737-126839 


Minus 


5295743162 


Plus 


6032140479 


Plus 


208471 


Minus 


3742-4362 


Plus 


6150342205 


Minus 
Plus 


28934046 
5355843757 


Minus 


77722-77793 


Minus 


133339-133467 


Plus 


647530447591 


Plus 


41830-42036 


Minus 


46497-46682 


Minus 


349301449409 


Minus 


556386456652 


Minus 


1985085-1986626 


Plus 


296663-297151 


Minus 


4420-5781 


Minus 


1020-1382 


Minus 


3124441439 


Plus 


148738-148987 


Minus 


237647-237726 


Minus 


10888-10984 


Minus 


5870848950 


Plus 


624479424585 


Phis 


165501-165614 


Minus 


97797-97990 


Plus 


80684214 


Plus 


3743747550 


Plus 


5055940747 


Minus 


114911-115087 


Minus 


145959-146446 


Minus 


239308439412 


Plus 


12845-12920 


Minus 


74523-74604 


Plus 


135772-135963 


Plus 


9179241849 


Plus 


3630946630 


Plus 


160563-160631 


Minus 


170560-170826 


Plus 


414945415620 


Minus 


1080089-1080235 


Minus 


260272-260379 
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328377 5888390 
328436 5888417 
328504 6888471 
328508 5868471 
328522 5868477 
328525 5868482 
328541 5868488 

328662 6004473 

328663 6004473 
328803 6004475 
328304 6004478 
328927 5868500 
328936 5868500 
328939 6004481 
328941 6456765 
328948 6456765 
328968 6456775 
330316 6007575 

330350 3056622 

330351 3056622 
330348 4544475 
329034 5868561 
323046 5868569 
329053 5868574 
329188 5868711 
329237 5868729 
329276 5868762 
329333 5868806 
329376 5868859 
329384 5868869 
329140 6017060 
329317 6381976 
329319 6381976 
329129 65B8026 
329373 6682537 
329412 6682553 
329424 5868879 
329446 5868886 
329449 5868886 



Plus 16947-17023 

Plus 203760203904 

Plus 47064-47217 

Plus 6071640830 

Plus 1972307-1972452 

Plus 12387-14313 

Plus 130958-131050 

Plus 1184773-1184855 

Plus 1185279-1186834 

Minus 291718-291948 

Minus 38844952 

Minus 428829-428893 

Minus 1352202-1352259 

Minus 131139-131320 

Minus 9817-9885 

Plus 28227-28413 

Plus 117442-118283 

Minus 119761-119931 

Minus 26413-26820 

Minus 2752247614 

Minus 19855-18962 

Minus 3281942939 

Plus 18971-19030 

Plus 426453426541 

Minus 13108-13225 

Plus 133238-133339 

Minus 222629-222709 

Plus 392666492746 

Plus 5235642694 

Minus 116524-116662 

Plus 290842-290905 

Plus 614823415209 

Plus 721390-721470 

Plus 144569-144712 

Minus 3895049301 

Minus 6894849041 

Plus 362198462344 

Plus 8477644899 

Plus 9769747771 
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TABLE 14: shows genes, including expression sequence tags, down-regulated in prostate 
tumor tissue compared to normal prostate tissue as analyzed using Affymetrix/Eos Hu02 
GeneChip array. Shown are the ratios of "average" normal prostate to "average" prostate 
cancer tissues. 
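
The ratio described in the caption above can be illustrated with a minimal sketch. It assumes that R1 is the mean background-subtracted signal of the normal prostate samples divided by the mean background-subtracted signal of the prostate tumor samples. The function name `r1_ratio`, the `background` and `floor` parameters, and the sample intensities are hypothetical and are not taken from this document; in particular, the clamping floor is an assumption added only to keep the ratio finite.

```python
from statistics import mean


def r1_ratio(normal, tumor, background=0.0, floor=1.0):
    """Ratio of mean background-subtracted normal signal to mean tumor signal.

    `background` and `floor` are illustrative assumptions, not values
    specified in the document.
    """
    def corrected(values):
        # Subtract the background estimate, then clamp at `floor` so that
        # the denominator can never be zero or negative.
        return [max(v - background, floor) for v in values]

    return round(mean(corrected(normal)) / mean(corrected(tumor)), 2)


# Made-up intensities for a single probeset (not data from Table 14).
print(r1_ratio([110.0, 130.0], [40.0, 40.0], background=10.0))
```

Under these assumptions, a value above 1 indicates higher average expression in normal prostate than in tumor tissue, which is how the down-regulated genes in this table are ranked.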



Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Background subtracted normal prostate : prostate tumor tissue 



Pkey ExAccn UnigeneID 



331328 
320875 
300994 
323461 
301015 
319419 
323486 

324882 
330569 
330126 
316265 
323045 
320668 



312614 
314790 
309979 
314236 
329192 
324307 



314921 
315840 
332776 
313533 
303494 
317490 
332546 
334719 
300679 
311811 
315310 
312871 
324715 
313870 
321453 
316160 
313833 
315850 
303124 
323346 
301383 
324513 
303480 
323591 
313603 
317863 
312381 
317514 
319750 



AA281133 

D60641 

AK51936 

AM18762 

AA947682 

AA543096 

C05278 

AW419080 
U57796 

AA737400 

AA1489S0 

RS8399 

AA465192 

AI766732 

AW341754 

AW452118 

AA743396 

AA627642 

AW500106 

AW452382 

AA679001 

AA034364 

AW298141 

F30712 

AI627358 

D84454 

AA813958 

AI625304 

AW511298 

H86747 

AI739168 

AW206435 

N50080 

AW197887 

AA766825 

AW270550 

AF161350 

AL134932 

AA913S91 

AW501678 

AA331906 

AA301270 

AW468119 

AI733395 

R42049 

AW451570' 

AA621606 



Hs.88808 ESTs 
Hs.131921 ESTs 
Hs.146298 ESTs 
Hs.190044 ESTs 

Hs.217173 ESTs; Weakly similar to Chain A; Cdc42hs-Gdp Complex [H.sapiens] 

Hs.13648 ESTs; Highly similar to mitogen-induced [M.musculus] 

Hs.166800 ESTs; Moderately similar to [PYRUVATE DEHYDROGENASE (LIPOAMIDE)] KINASE ISOZYME 4 PRECURSOR [H.sapiens] 
Hs.250645 ESTs 
Hs.57679 zinc finger protein 192 

CH.21_p2 gi|6093735 
Hs.142230 ESTs 
Hs.188836 ESTs 
Hs.146217 ESTs 
Hs.16514 ESTs 
Hs501194 ESTs 
Hs.189305 ESTs 
Hs.257533 EST 
Hs.189023 ESTs 

CH.X_hs gi|5868716 

Hs.4994 transducer of ERBB2; 2 (TOB2) 

EST cluster (not in UniGene) with exon hit 
Hs.257564 ESTs 
Hs.192221 ESTs 

Hsi56551 ESTs; Weakly similar to !!!! ALU CLASS B WARNING ENTRY !!!! [H.sapiens] 
Hs.157975 ESTs 

EST cluster (not in UniGene) with exon hit 
Hs.148367 ESTs 

Hs.21899 solute carrier family 35 (UDP-galactose transporter); member 2 

CH22_FGENES.421_30 
Hs.207727 ESTs; Moderately similar to KIAA0071 [H.sapiens] 
Hs.190312 ESTs 
HS356067 ESTs 
Hs.227602 KIAA1115 protein 

EST cluster (not in UniGene) 
Hs.148057 ESTs 
Hs.117827 ESTs 
Hs.253353 ESTs 

EST cluster (not in UniGene) 
Hs.116957 ESTs 

EST cluster (not in UniGene) with exon hit 
Hs.143607 ESTs 
Hs.126480 ESTs 
Hs.164577 ESTs 

EST cluster (not in UniGene) with exon hit 

EST cluster (not in UniGene) 

EST cluster (not in UniGene) 
Hs.129124 ESTs 
Hs.195473 ESTs 
Hs.126850 ESTs 
Hs.117856 ESTs 




R1 

1853 
1455 
12.17 
1055 
1W7 
9.2 
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8 

7.88 

7£ 

7.7 

7.64 

7.4 

7.15 

7 

6.83 

6.74 

6.49 

6.1 

559 

5.82 

5.8 
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5.43 

SA 

5.35 
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5.19 

5.11 

4.97 
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4.78 

4.63 

4.58 

4.53 

4.46 

44 
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428 
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42 
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322520 755958 ESTc(uster(notlnUniGene) 4 

314754 AW026781 Hs.134374 ESTs 4 

316088 AB90652 Hs.208973 ESTs 4 

318473 AI339333 Hs.146883 ESTs 3.96 

307848 AB84186 EST singleton (not in UniGene) with exon hit 195 

300730 AW449204 Hs557125 ESTs 194 

303034 W60843 Hs51570 ESTs 353 

3246S8 AI679131 Hs50i424 ESTs 35 

324674 AA541323 Hs.115831 ESTs 358 

300547 N53442 Hs.143443 ESTs 353 

316100 AW203986 Hs.213003 ESTs 3.79 

314801 AA481027 Hs.127336 ESTs; Weakly similar to ORF YGR245C [S.cerevisiae] 3.75 

320856 D59945 EST cluster (not in UniGene) 3.74 

313188 AB39702 Hs.179573 collagen; type I; alpha 2 3.73 

314187 AA804409 Hs.118820 ESTs 3.73 

311826 AA765470 Hs.122826 ESTs 3.7 

302358 D81150 EST cluster (not in UniGene) with exon hit 3.68 

311441 Z38720 Hs.151014 ESTs 3.66 

321914 AA011603 EST cluster (not in UniGene) 3.59 

332216 H35082 Hs.102332 EST 3.52 

324771 AA631739 EST cluster (not in UniGene) 3.5 

323691 AA317561 EST cluster (not in UniGene) 3.49 

303525 AW516519 Hs.115130 ESTs 3.47 

309709 AW242630 EST singleton (not in UniGene) with exon hit 3.46 

300038 AFFX control: MurlW 358 

316526 AI088192 Hs.135474 ESTs; Weakly similar to ATP-DEPENDENT RNA HELICASE A [H.sapiens] 356 

313029 AA731520 Hs.170504 ESTs 355 

304356 AA1960Z7 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 354 

314810 AI948688 Hs.191805 ESTs 353 

329815 CH.14_p2 gi|6624888 352 

314949 AI7453B7 Hs539124 ESTs 351 

300598 N53574 Hs.158932 ESTs 35 

329218 CH.X_hs gi|5868726 358 

315706 AW440742 Hs.155556 ESTs 358 

303751 AW503637 EST cluster (not in UniGene) with exon hit 355 

307783 AI347274 EST singleton (not in UniGene) with exon hit 355 

321414 AA324975 Hs.128993 ESTs; Weakly similar to KIAA04S5 protein [H.sapiens] 355 

312187 AA700439 Hs.188490 ESTs 355 

334061 CH22_FGENES527_14 353 

336036 CH22_FGENES.678_7 353 

321477 H67818 Hs522059 ESTs 351 

315760 AW139383 Hs.245437 ESTs 35 

316733 AA811713 Hs.163222 ESTs 35 

300855 AW235248 Hs.79828 ESTs 35 

323611 AA304986 Hs.145704 ESTs 3.19 

314138 AA740618 EST cluster (not in UniGene) 3.17 

316774 AA814859 EST cluster (not in UniGene) 3.16 

308884 AI833131 Hs.179100 ESTs 3.11 

331317 AA258222 Hs57757 ESTs 3.1 

317221 AI98953B Hs.191074 ESTs 3.08 

316386 AA749062 Hs.180285 ESTs 3.08 

321040 H26953 EST cluster (not in UniGene) 3.08 

308828 AI824829 EST singleton (not in UniGene) with exon hit 3.08 

300778 AA236233 Hs.188716 ESTs 3.07 

316667 AW015940 HS532234 ESTs 3.07 

324614 AW503101 EST cluster (not in UniGene) 3.07 

316468 AW293046 Hs555158 ESTs 3.07 

300671 AK39706 Hs.189886 ESTs 3.06 

314301 AW297987 Hs.188181 ESTs 3.05 

312335 AW043620 Hs536993 ESTs 353 

322957 AA247755 EST cluster (not in UniGene) 351 

316848 AA830053 Hs.126798 ESTs 351 

313473 AA009660 HS551948 ESTs; Moderately similar to T07D3.7 [C.elegans] 259 

318518 T27119 EST cluster (not in UniGene) 258 

313383 AI076370 Hs.134037 ESTs 257 

331389 AA458637 Hs.162207 ESTs 256 

304257 AA053294 EST singleton (not in UniGene) with exon hit 255 

309917 AW340014 EST singleton (not in UniGene) with exon hit 255 

319661 H08035 Hs51398 ESTs; Moderately similar to PUTATIVE GLUCOSAMINE-6-PHOSPHATE 
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321253 
321193 
332864 
300Q27 

324330 
320014 
333916 
318885 
318146 
323348 
305703 
335862 
317672 
323416 
312652 
324034 
319761 
317013 
317383 
314659 
312479 



311824 
321992 
316074 



312071 
312684 



322139 
304168 
325602 
319885 
300611 
316854 
318208 
331623 
324616 
304938 
314912 
300767 
313463 
320600 
301180 
324825 
300336 

317850 
339047 
324580 
321142 
319478 
300793 
313733 
326505 
314987 
303114 
318709 
312878 
329224 
328018 
323231 
312887 
315183 



AB994B4 
M149508 



M11507 
AAB84766 
AA137114 

Z43272 
AI040125 
AA233056 
AA825148 

AW2O5409 

A1610397 

AI419909 

AA382603 

R84237 

AA864468 

AA913887 

AW277121 

AI950844 

AW293826 

C06003 

AW517542 

AW296076 

AA683529 

AW294Q20 

AA062971 

H53744 

H77679 

R59096 

N75450 

AAB31215 

AI091458 

R38715 

AI823999 

AA614308 

A1431345 

AW193486 

AI057369 

AA135S65 

AB08989 

AA704457 

AW292417 

N29974 

AA492588 

AI817933 

R06841 

AE48571 

AA836116 

AW015506 
AF090948 
H24244 
AE09108 



ISOMERASE [H.sapiens] 
EST cluster (not in UniGene) 
Hs.103288 ESTs 

CH22_FGENES.28_4 



Hs.170291 



Hs.150521 
Hs.191518 
H&21229 

Hs.127748 
Hs.159560 
Hs.160994 



Hs.135646 
Hs.126511 
HS254881 
Hs.128738 

H&250610 
Hs.1 16456 
H&208382 

Hs.143119 
Hs.1 17721 
Hs.181161 



313240 
316697 



AA324437 

AW157377 

AW136134 

A1479011 

AI743261 

AW293174 



Hs.136698 

H&159066 
H&134559 
Hs.153529 
Hs.162000 

Hs.161784 
Hs.136525 
Hs.122536 
Hs.250739 
Hs.156939 
Hs.255738 
Hs355074 



H&209584 
Hs.1 86837 

Hs.130730 

Hs.240763 
Hs.1 43946 



Hs.177230 
Hs.132910 
Hs.220277 
Hs.170783 
Hs.131860 
H&252627 



AFFX control: transferrin receptor 
EST cluster (not in UniGene) 
ESTs 

CH22_FGENES.298_5 
EST cluster (not in UniGene) 
ESTs 
ESTs 

F-box protein Fbw1b 

CH22_FGENES.629_7 

ESTs 

ESTs 

ESTs 

EST cluster (not in UniGene) 

EST cluster (not in UniGene) 

ESTs 

ESTs 

ESTs 

ESTs; Weakly similar to non-lens beta gamma-crystallin like protein [H.sapiens] 
CH22_FGENES.7_10 
ESTs 
ESTs 
ESTs 

EST singleton (not in UniGene) with exon hit 

ESTs 

ESTs 

ESTs; Weakly similar to INHIBITOR OF APOPTOSIS PROTEIN 1 [M.musculus] 
EST cluster (not in UniGene) 
EST singleton (not in UniGene) with exon hit 
CH.13_hsgi]586B934 
ESTs 

EST cluster (not in UniGene) with exon hit 



ESTs 

Homo sapiens clone 24540 mRNA sequence 
ESTs 

EST singleton (not in UniGene) with exon hit 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs; Moderately similar to gag [H.sapiens] 

ESTs; Moderately similar to high-risk human papilloma viruses E6 oncoproteins targeted protein E6TP1 alpha [H.sapiens] 

EST cluster (not in UniGene) 

CH22_OA59H16.GENSCAN.28-7 

EST cluster (not in UniGene) 

ESTs 

EST cluster (not in UniGene) 
ESTs 

EST cluster (not in UniGene) 

CH.19_hs gi|5867435 

ESTs 

EST cluster (not in UniGene) with exon hit 
ESTs; Weakly similar to /prediction 
ESTs 

CHX-hsgi|58S8728 

CH.06_hs gi|5902482 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 



235 
233 
233 
232 

231 

238 

238 

238 

237 

237 

235 

234 

233 

232 

231 

231 

231 

23 

23 

2.78 

2.78 

2.77 

2.75 

2.75 

2.73 

2.73 

2.73 

2.73 

2.72 

2.72 

2.72 

2.72 

2.71 

2.71 

2.71 

2.69 

2.68 

2.68 

238 , 

237 

2.67 

2.67 

2.65 

235 

235 

235 

234 

234 

234 

233 

232 

232 

2.61 

23 

2.6 

23 

239 

238 

237 

236 

236 

235 

235 

235 

234 

234 

233 
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313S66 


AI807551 


Hs.189061 


ESTs 


ZS3 


331263 


AA015718 




ze31a12j1 Soares retina N2b4HR Homo sapiens cDNA clone 
IMAGE36574 3', mRNA sequence 


£51 


310683 


AW056233 


Hs.160870 


ESTs 


2£ 


302566 


AA085996 


H&24B572 


Human PAC clone DJ404F18 from Xq23 


2S 


302697 


AJ001408 




EST cluster (not in UniGene) with exon hit 


ZJS 


308362 


A1613519 




EST singleton (not in UniGene) with exon hit 


£49 


322347 


AF086538 




EST cluster (not in UniGene) 


249 


316240 


AA974253 


Hs.120319 


ESTs 


249 


323208 


AA203415 


Hs.138200 


ESTs 


248 


321643 


W76005 


Hs.32094 


ESTs 


2.48 


330723 


AA243617 


Hs.31082 


ESTs; Highly similar to db83 [R.norvegicus] 


248 


323455 


AA256675 


H&20043B 


ESTs; Weakly similar to atypical PKC specific binding protein [R.norvegicus] 


2.47 


308383 


AI624497 




EST singleton (not in UniGene) with exon hit 


2.47 


328744 






CH.07_hs gi|5868290 


247 


332344 


W45574 


Hs252497 


ESTs 


247 


328121 






CHD6_hsfllj5858031 


2.47 


321915 


AI670955 


Hs500151 


ESTs 


246 


314954 


AA521381 


Hs.187728 


ESTs 


245 


302821 


AA188868 


Hs.173933 


ESTs; Weakly similar to NUCLEAR FACTOR 1/X [H.sapiens] 


245 


329454 






CH.Y_hsgi|5868887 


245 


336605 






CH22_FGENES.420_4 


245 


300664 


AM44628 


Hs.256809 


ESTs 


2.44 


323362 


AL135067 


Hs.117182 


ESTs 


2.44 


300024 


M10098 


AFFX control: 18S ribosomal RNA 


2.44 


325026 


AI671168 


Hs.12285 


ESTs 


243 


324510 


AI148353 


Hs.120849 


ESTs 


243 


313389 


AI765182 


Hs.119903 


ESTs 


243 


301309 


M78276 


Hs.255917 


ESTs 


2.43 


313570 


AA041455 


Hs.209312 


ESTs 


243 


316504 


AW135854 


Hs.132458 


ESTs 


242 


319401 


R01342 




EST cluster (not in UniGene) 


2.42 


312827 


AI744361 


Hs.205591 


ESTs; Weakly similar to zinc finger protein Png-1 [M.musculus] 


242 


327871 






CH.06_hs gi|5868131 


241 


337173 






CH22.FGENES.565-3 


2.41 


302948 


AA465635 




EST cluster (not in UniGene) with exon hit 


2.41 


324303 


AL118754 




EST cluster (not in UniGene) 


2.4 


315527 


AI791138 


Hs.116768 


ESTs 


2.4 


315979 


AA830515 


Hs.222917 


ESTs 


24 


331310 


AA253351 


Hs.44439 


STAT Induced STAT inhibitor-4 


2.4 


321095 


AA017595 


Hs.32844 


ESTs 


2.4 


308561 


AI701559 




EST singleton (not in UniGene) with exon hit 


2.39 


313035 


N36417 


Hs.144928 


ESTs 


2.37 


322114 


AAS43791 


Hs.191740 


ESTs 


237 


313671 


W49823 


Hs.145553 


ESTs 


237 


303211 


AA099548 


Hs.191436 


ESTs; Highly similar to OVI11180244 [H.sapiens] 


237 


301256 


AA932948 




EST cluster (not in UniGene) with exon hit 


236 


338165 






CH22_EMAC0055W.GENSCAN.212-3 


2.36 


324692 


AA557952 




EST cluster (not in UniGene) 


2.35 


318587 


AA779704 


Hs.168830 


ESTs 


2.35 


312378 


R41582 


Hs.109219 


retinal degeneration B beta 


2.35 


318625 


T48446 


Hs.193162 


ESTs 


235 


305181 


AAS63726 


Hs.116922 


EST 


235 


300815 


AA286678 




EST cluster (not in UniGene) with exon hit 


2.34 


324063 


AW292740 


HS254815 


ESTs 


2.34 


315859 


AAS82305 


Hs.133268 


ESTs 


233 


305092 


AAS42912 




EST singleton (not in UniGene) with exon hit 


233 


306598 


AW00320 




EST singleton (not in UniGene) with exon hit 


233 


300307 


AI651016 


Hs246311 


ESTs 


233 


321348 


Z49979 




EST cluster (not in UniGene) 


233 


325112 


AI903770 


Hs.124344 


ESTs 


232 


336679 






CH22_FGENES.43-7 


232 


321383 


AJ002574 




EST cluster (not in UniGene) 


232 


337357 






CH22_FGENES.73W 


2.31 


300680 


AW468066 


Hs257712 


ESTs; Weakly similar to KIAA0986 protein [H.sapiens] 


23^ 


327120 






CH.21_hs gi|6531970 


22\ 


302761 


AW250553 




EST cluster (not in UniGene) with exon hit 


23 


312132 


AI475490 


Hs.170577 


ESTs 


23 


315639 


AA827652 




EST cluster (not in UniGene) 


23 
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312189 T9S594 Hs.187435 ESTs 23 

308537 AAS91705 EST singleton (not in UniGene) with exon hit ZZ 

327061 CH.21_hs gi|6531965 2J 

315391 AA759098 Hs.192007 ESTs 

322384 AISS8S46 Hs.33862 ESTs 2.29 

323206 AA203339 Hs.220750 ESTs 2.29 

318110 AI680915 Hs.201379 ESTs 2.28 

335250 CH22_FGENESi16_11 2.28 

331696 238907 Hs£l662 KIAA0888 protein 2.28 

318327 AW294013 Hs200942 ESTs 2.28 

324980 AA969121 Hs254296 ESTs 2.28 

319429 AI608881 Hs.11482 ESTs; Highly similar to junctional adhesion molecule [H.sapiens] 2.28 

310601 AI970543 Hs.192605 ESTs 2.28 

318905 Z43395 EST cluster (not in UniGene) 2.28 

323442 AA252753 Hs.164039 ESTs 2.27 

304428 AA342250 Hs.99819 ubiquitin specific protease 16 2.27 

313352 AVK92127 Hs.144758 ESTs 2.27 

316491 AA766025 Hs238794 EST 2.27 

317751 AI697668 KS2Q2241 ESTs 2.26 

314136 AA229781 Hs221962 ESTs 2.26 

306665 AI004614 Hs.130577 EST 2.26 

303946 AW474196 Hs221604 ESTs 2.25 

313435 AA769123 EST cluster (not in UniGene) 2.25 

317679 AA968799 Hs.150289 ESTs 2.25 

322370 AA330095 EST cluster (not in UniGene) 2.25 

306620 AI000929 EST singleton (not in UniGene) with exon hit 2.24 

329109 CHJLhsglI5868626 2.24 

311043 AI871209 Hs.177128 ESTs 2.24 

300228 AI4S8372 Hs.158748 ESTs; Weakly similar to synapsin Ib [M.musculus] 2.24 

307223 AI193698 Hs.184776 ribosomal protein L23a 2.24 

309023 AI888045 EST singleton (not in UniGene) with exon hit 2.23 

310749 AI493675 Hs.170332 ESTs 2.23 

316769 AI914939 Hs212184 ESTs 2.22 

320409 AA356195 EST cluster (not in UniGene) 2.21 

333149 CH22_FGENES.87_8 2.21 

324951 M86125 Hs.137487 ESTs 2.21 

321939 AI791617 Hs.145068 ESTs 2.2 

320594 AI863952 Hs.169436 arginyltransferase 1 2.2 

320722 R67430 Hs.172787 ESTs 2.2 

321781 D78667 EST cluster (not in UniGene) 2.2 

328903 CH.08_hs gi|5868514 2.2 

303889 T19204 EST cluster (not in UniGene) with exon hit 2.2 

325045 T08845 EST cluster (not in UniGene) 2.2 

312828 AI865455 Hs211818 ESTs; Moderately similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 2.19 

335109 CH22_FGENES.494_15 2.18 

330878 AA131471 Hs.71440 ESTs 2.18 

311289 AB71362 Hs23194S ESTs 2.18 

304608 AA513456 EST singleton (not in UniGene) with exon hit 2.18 

337393 CH22_FGENES.747-4 2.18 

332812 CH22_FGENES.7_14 2.18 

327665 CH.04_hs gi|5867839 2.18 

314581 AW504859 HS237849 ESTs 2.17 

326508 CH.19_hs gi|6682496 2.17 

301242 AW161535 Hs258803 ESTs 2.17 

312780 AI765651 Hs.172900 ESTs 2.17 

315954 AW276B10 Hs254859 ESTs 2.16 

311179 AI880843 Hs223333 ESTs 2.16 

315320 AI084182 Hs.186895 ESTs 2.16 

313017 AI015203 Hs.118015 ESTs 2.16 

312430 AW139117 Hs.117494 ESTs 2.15 

300864 AA406539 Hs.190958 ESTs 2.15 

314753 AA463262 EST cluster (not in UniGene) 2.15 

322574 AF156548 EST cluster (not in UniGene) 2.15 

321409 C03864 EST cluster (not in UniGene) 2.15 

321205 AA002047 EST cluster (not in UniGene) 2.14 

320406 AA353895 Hs.152983 HUS1 (S. pombe) checkpoint homolog 2.14 

337646 CH22_BAAC000097.GENSCAN.1 1-2 2.13 

303084 AF174008 EST cluster (not in UniGene) with exon hit 2.13 

312185 AA654772 Hs.186564 ESTs 2.13 
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306813 


AI066544 


314465 


AA602917 


318168 


AI821782 


315990 


AI80Q041 


320712 


R66867 


318487 


AI167877 


317462 


AW015206 


304384 


AA235432 


314544 


AA399018 


319881 


T72744 


328078 




317354 


AW090770 


308617 


AI738720 


311568 


AW439969 


313605 


AI761786 


314289 


AA84811B 


332933 




325498 




313659 


AW2960S7 


324596 


AW149321 


324783 


AA640770 


302696 


AA347452 


313418 


AW450674 


326920 




327574 




323207 


AI052795 


303753 


AW503733 


305235 


AA670480 


316055 


AA693880 


317194 


AW445167 


319565 


AW4086B3 


335146 




301475 


AI678183 


312442 


M12Q970 


322502 


R62925 


303693 


AA290875 


310179 


AE15643 


321121 


W23285 


331330 


AA282197 


303557 


AA994530 


317865 


AI298794 


318667 


AI493742 


318042 


AW294522 


323818 


AW245528 


331286 


AA1370S2 


311262 


AI989942 


335601 




311351 


AI682303 


312996 


AA249018 


328190 




338030 




333940 




328227 




331481 


N27448 


335288 




307513 


AE74307 


323316 


AL134620 


319479 


R21945 


303482 


AA502583 


327489 




323935 


AW175841 


309575 


AW168096 


337043 




312897 


AI828174 


307881 


AB70434 


328656 




314569 


AA813784 


332783 


W45302 


315259 


AA701499 



Hs.156974 
Hs220587 
Hs.190555 

Hs.143716 
Hs.178784 
H&62954 
H&250835 



Hs.192271 

H&218177 
Hi204674 
H&221216 



Hs.124106 
Hs.105411 



Hs.114696 



Hs.192201 
Hs.170315 



Hs.126036 
H&32922 

Hs.170917 
Ks.143199 
H&243665 
H&30120 
Hs.171381 

Hs.89002 

Hs.129130 
Hs.165210 
Hs.149991 
Hs.134754 
Hs.103853 
HS232150 

Hs.201274 



Hs.43944 



lhB 2.13 
ESTs. 2.12 
ESTs; Moderately similar to !!!! ALU SUBFAMILY SC WARNING ENTRY !!!! [H.sapiens] 

2.11 
2.11 
2.11 
2.11 
2.11 
2.1 
2.1 
2.1 
2.1 
2.09 
2.09 
2.09 
2.08 
2.08 
2.08 
2.08 
2.08 
2.07 
2.07 
2.06 
2.06 
2.06 
2.06 
2.05 
2.05 
2.05 
2.05 
2.05 
2.05 
2.04 
2.04 
2.04 
2.04 
2.03 
2.03 
2.03 
2.03 
2.03 
2.02 
2.02 
2.02 
2.01 
2.01 
2.01 
2.01 
2.01 
2 
2 
2 
2 
2 
2 
2 
2 
2 
2 

1.99 
1.99 
1.99 
1.98 
1.98 
1.98 
1.98 
1.98 
1.98 
1.98 



2.12 



ESTs 

EST cluster (not in UniGene) 

ESTs 

ESTs 

ferritin; heavy polypeptide 1 
ESTs 
EST cluster (not in UniGene) 
CH.OBjisgJI5868008 
ESTs 

EST singleton (not in UniGene) with exon hit 

ESTs 

ESTs 

ESTs 

CH22J=GENESX8_7 

CH.12_hs gi|5866967 

ESTs 

ESTs 

EST cluster (not in UniGene) 

EST cluster (not in UniGene) with exon hit 

ESTs 

CH.21_hs gi|6456782 

CH.03_hs gi|5867818 

ESTs 

ESTs 

EST singleton (not in UniGene) with exon hit 

EST cluster (not in UniGene) 

ESTs 

ESTs 

CH22_FGENES.499_2 

prostaglandin E receptor 3 (subtype EP3) 

ESTs 

ESTs 

ESTs 

ESTs 

EST cluster (not in UniGene) 

ESTs; Highly similar to CQMJ7 protein [H.sapiens] 

EST singleton (not in UniGene) with exon hit 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

CH22_FGENES£81_41 
ESTs 

EST cluster (not in UniGene) 

CH.06_hs gi|5868077 

CH22.EMAC005500.GENSCAN.148-16 

CH22_FGENES.301_6 

CH.06_hs gi|5868105 

EST 

CH22.FGENES.527J 
EST singleton (not in UniGene) with exon hit 
EST cluster (not in UniGene) 
ESTs 
ESTs 

CHJK_hsgi|6004459 
ESTs 



H&256153 
Hs.197271 

Hs.192183 
Hs.195188 

CH22_FGENES.439-19 
Hi227049 ESTs 

EST singleton (not in UniGene) with exon hit 

CH.07_hs gi|6004473 
Hs.123001 ESTs 
Hs.87889 helicase-moi 
Hs.148115 ESTs 
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313171 NS7879 

318060 AI241421 

332256 N58393 

312110 AI962180 



W00545 

AA868267 

H15474 

AA862973 

AI373163 

AW090537 

AW028820 

AI820675 

AW373446 



314065 



323919 
310750 
309435 
300129 
320130 
323787 
338112 
313625 
325240 
331833 
332252 



AW468402 

AA412102 
N63882 



300279 AW237425 
326023 

321609 H86021 
324183 AA402453 
336276 
334913 
325417 

318489 AW043590 
318455 AI148763 
306890 AI092235 
315073 AW452948 
321289 R84687 
308521 AI689808 
306382 AA968967 
331320 AA262999 
324279 AA501412 
309577 AW168753 
327014 

303488 AW025860 
306561 AA995223 
330694 AA01S806 
313083 N50545 
327752 

318674 AA295490 
301267 AW297762 
332092 AA608787 
323509 AUJ36947 
321452 AA317554 
311483 AI765013 
300976 AI246374 
323715 AA322155 
313800 AW296132 
332029 AA489697 
304013 AW518573 
322019 AA354549 
334150 

310094 AW450967 
316218 AW207642 
324774 AI031771 
326507 

314570 AA405696 
336268 

315278 AI985544 
325824 

316277 AA737780 
323181 AA418583 
301438 AA961643 
307050 AI147341 
306830 A1075803 



Hs.157695 ESTs 
Hs.132236 ESTs 
Hs.102754 ESTs 
Hs.226803 ESTs 

CH22_FGENES529JB 
Hs.171785 ESTs 
Hs55524 ESTs 

Hs.12214 Homo sapiens clone 23716 mRNA sequence 

Hs220704 ESTs 

Hs.170333 ESTs 

EST singleton (not in UniGene) with exon hit 
EST cluster (not in UniGene) with exon hit 

HS203804 ESTs 



1.97 
1.97 
1.97 
1.97 
1.97 
1.97 
1.96 
1.96 
1.96 
1.96 
1.96 
1.96 
1.95 



Hs.169885 ESTs Waakty similar to cDlM EST EMBLT02216c^ 155 

CH22_EM*C005500.6ENSCAN.18M4 
HS-254020 ESTs 

CH.10_hs gi|5866848 
Hs.250911 interleukin 13 receptor; alpha 1 

za21B.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone 
IMAGE293225 3', mRNA sequence 
Hs.253817 ESTs 

CH.17_hs gi|5867245 
Hs.198800 ESTs; Weakly similar to hMmTRAlb [H.sapiens] 
Hs.113011 ESTs 

CH22_FGENES.762_5 
CH22_FGENES.456_3 
CH.12_hs gi|5866925 
Hs.225023 ESTs 

EST cluster (not in UniGene) 
EST singleton (not in UniGene) with exon hit 
HS257631 ESTs 
HS226306 ESTs 

EST singleton (not in UniGene) with exon hit 
EST singleton (not in UniGene) with exon hit 
Hs.42788 ESTs 

Hs.191688 ESTs; Weakly similar to Pro-Pol-dUTPase polyprotein [M.musculus] 

EST singleton (not in UniGene) with exon hit 

CH.21_hs gi|5867664 

EST cluster (not in UniGene) with exon hit 
Hs.129559 EST 
Hs.108447 
Hs.159200 ESTs 

CHj05JiSBil5887949 

EST cluster (not in UniGene) 
Hs.255690 ESTs 
Hs.112590 ESTs 

EST cluster (not in UniGene) 

EST cluster (not in UniGene) 
Hs509128 ESTs 
Hs.185861 ESTs 

EST cluster (not in UniGene) 
Hs.166674 ESTs 
Hs.145053 ESTs 

Hs.156110 immunoglobulin kappa variable 1D-8 

Hs.41181 Homo sapiens mRNA; cDNA DKFZp727C191 (from clone DKFZp727C191) 
CH22_FGENES539_1 



Hs.235240 ESTs 
Hs.174021 ESTs 
Hs.132586 ESTs 

CH.19_hs gi|5867435 

EST cluster (not in UniGene) 

CH22_FGENES758_2 
Hs.116429 ESTs 

CH.15_hs gi|5867048 
Hs.213392 ESTs 
Hs.143621 ESTs 
Hs.127716 ESTs 
Hs.146734 EST 

EST singleton (not in UniGene) with exon hit 
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155 
1.95 
1.95 
1.95 

1.95 
1.95 
1.95 
1.94 
1.94 
1.94 
1.94 
1.94 
1.94 
1.94 
1.94 
1.94 
1.94 
1.93 
1.93 
1.93 
1.93 
1.93 
1.93 
1.93 
1.92 
1.92 
1.92 
1.92 
1.92 
1.91 
1.91 
1.91 
1.91 
1.91 
1.91 
1.91 
1.91 
1.91 
1.91 
1.91 
1.9 
1.9 
1.9 
1.9 
1.9 
1.9 
1.9 
1.9 
1.9 
1.9 
1.9 
1.89 
1.89 
1.89 
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302428 AUM9925 Hs22S984 DKFZP647G0910 protein 1-89 

320127 H72815 Hs.17268 ESTs 1.89 

337736 CH22_EMdC000097.GENSCAN.1003 1.89 

331319 AA262765 Hs.184264 ESTs 1.88 

310767 AI377505 Hs.158835 ESTs 1.88 

314880 AI732169 Hs.105429 ESTs 1.88 

312539 AI004377 Hs500360 ESTs 1.88 

309874 AW205604 Hs.168034 ESTs; Weakly similar to !!!! ALU SUBFAMILY SP WARNING ENTRY !!!! [H.sapiens] 1.88 

314621 AJ627478 Hs.187670 ESTs 1.88 

319435 AI972146 Hs.192756 ESTs 1.88 

313472 AA007374 EST cluster (not in UniGene) 1.88 

302705 U09060 EST cluster (not in UniGene) with exon hit 1.88 

329511 CH.10_p2 gi|3983514 1.88 

317140 AI699412 Hs.201925 ESTs 1.87 

302598 AI815985 Hs.128683 ubiquitin-conjugating enzyme E2D 1 (homologous to yeast UBC4/5) 1.87 

301153 AA725670 Hs.120485 ESTs; Weakly similar to serine/threonine kinase with SH3 domain; leucine zipper domain and proline rich domain [H.sapiens] 1.87 

332222 N28271 Hs.176618 ESTs 1.87 

330703 AA055475 Hs.104143 clathrin; light polypeptide (Lca) 1.87 

318470 AI159863 Hs.143713 ESTs 1.87 

314014 AW291847 Hs.121715 ESTs; Weakly similar to HP protein [H.sapiens] 1.87 

300370 AI8278T7 EST cluster (not in UniGene) with exon hit 1.86 

312329 R84768 Hs.13399 Homo sapiens clone 25032 mRNA sequence 1.86 

325587 CH.12_hs gi|6682462 1.86 

310237 AI8B4313 Hs.158906 ESTs 1.86 

318872 R13035 EST cluster (not in UniGene) 1.86 

303431 AA317915 EST cluster (not in UniGene) with exon hit 1.86 

338427 CH22_EMAC005500.GENSCAN^49-1 1.86 

300452 AB52293 Hs.191098 ESTs 1.85 

321279 H85330 Hs.146060 ESTs 1.85 

301690 F05865 Hs249180 ubiquitin-conjugating enzyme E2E 2 (homologous to yeast UBC4/5) 1.85 

307932 AJ230822 EST singleton (not in UniGene) with exon hit 1.85 

318292 AI679966 Hs.150603 ESTs 1.85 

310254 AI239811 Hs.157491 ESTs 1.85 

311790 AW016437 Hs.233462 ESTs 1.84 

314248 AA278347 Hs.126078 ESTs 1.84 

335586 CH22_FGENES581J5 1.84 

339209 CH22_FF1 1301 1 .GENSCAN.6-4 1.84 

307954 AI419692 EST singleton (not in UniGene) with exon hit 1.84 

302549 AF055136 Hs.248162 tectorin alpha 1.84 

321629 H87213 Hs.156092 ESTs 1.84 

301239 AAB07558 EST cluster (not in UniGene) with exon hit 1.84 

332434 N75542 Hs.75356 transcription factor 4 1.84 

327192 CH.01_hs gi|5867445 1.83 

310214 AI220072 Hs.165893 ESTs 1.83 

320516 R33857 Hs.181479 ESTs; Weakly similar to E-SELECTIN PRECURSOR [H.sapiens] 1.83 

324231 W60827 EST cluster (not in UniGene) 1.83 

336616 CH22JGENES313J5 1.83 

328799 CH.07_hs gi|5868316 1.83 

324661 AW504161 EST cluster (not in UniGene) 1.83 

313190 AA766707 Hs.153039 ESTs 1.83 

301979 L28168 Hs.121495 potassium voltage-gated channel; Isk-related family; member 1 1.82 

302099 AL021397 Hs.137576 ribosomal protein L34 pseudogene 1 1.82 

320187 T99949 EST cluster (not in UniGene) 1.82 

320791 R78808 Hs339B1 ESTs; Weakly similar to !!!! ALU CLASS A WARNING ENTRY !!!! [H.sapiens] 1.82 

305733 AA829535 Hs.84298 CD74 antigen (invariant polypeptide of MHC; class II antigen-associated) 1.82 

308280 AI569349 Hs.180920 ribosomal protein S9 1.81 

321533 W78877 Hs.40111 ESTs 1.81 

312946 AI915122 Hs.204087 ESTs; Weakly similar to F33D11.90 [C.elegans] 1.81 

319474 H90265 Hs.100638 ESTs 1.81 

329519 CH.10_p2 gi|3983510 1.81 

324685 AA220982 EST cluster (not in UniGene) 1.81 

320697 N62937 Hs.139181 ESTs 1.81 

329246 CHJLhs 015868732 1.81 

332000 AA481271 Hs.193945 ESTs 1.81 

310311 AI420990 Hs.161303 ESTs 1.81 

325666 CH.16_hs gi|5867076 1.81 

322064 Z78343 EST cluster (not in UniGene) 1.8 

333712 CH22_FGENES251_1 1.8 
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313457 AA576052 Hs.193223 ESTs 1-B 

321591 H85687 Hs.117927 ESTs 1.8 

330260 CH.05_p2 gi|6671884 1.8 

311080 AI656320 Hs.197711 ESTs 1.8 

329522 CH.10_p2 gi|3983507 1.8 

322889 AA081924 Hs211417 ESTs 1.8 

300175 AE75011 Hs^04877 ESTs 1.8 

330976 H20560 Hs.244624 ESTs 1.8 
300208 AB41180 Hs.196115 ESTs; Weakly similar to FIBRILLIN 1 PRECURSOR [H.sapiens] 1.79 

319635 R17531 EST cluster (not in UniGene) 1.79 

313454 AA730S73 Hs.188634 ESTs 1.79 

303093 AHO0310 Hs.148958 ESTs 1.79 

309815 AW292760 EST singleton (not in UniGene) with exon hit 1.79 

326506 CH.19_hs gi|5867435 1.79 

319845 AA649011 Hs.187902 ESTs 1.79 

300230 AB23739 Hs.186387 ESTs 1.79 

312180 AE48285 Hs.118348 ESTs 1.79 

313058 081015 Hs.125382 ESTs 1.79 

330120 CH.19_p2 gi|6671864 1.78 

328412 CHXf7Jts #868405 1.78 

302345 NMJM0565 EST cluster (not in UniGene) with exon hit 1.78 

308100 AW75949 EST singleton (not in UniGene) with exon hit 1.78 

311386 AW205705 Hs.207514 ESTs 1.78 

330282 CH.05_p2 gi|6671910 1.78 

318856 Z43011 Hsi1169 ESTs 1.78 

312486 AA845630 Hs.117904 ESTs 1.78 

325450 CH.12_hs gi|5866941 1.78 

321206 H54178 Hs326469 ESTs 1.78 

330977 H20826 Hs31783 ESTs 1.78 
303487 AA333666 EST cluster (not in UniGene) with exon hit 1.77 

310398 AB64671 Hs.164166 ESTs 1.77 

313230 AI540166 Hs.129563 ESTs 1.77 

317747 AI683782 Hs.128245 ESTs 1.77 

303381 AL03B841 Hs.163313 ESTs; Weakly similar to !!!! ALU SUBFAMILY SB WARNING ENTRY !!!! [H.sapiens] 1.77 

336123 CH22_FGENES.701_8 1.77 

300185 AK86182 Hs.208484 ESTs 1.77 

316002 AW451733 Hs.119824 ESTs 1.77 

319850 AA001811 Hs.83722 ESTs 1.77 

329941 CH.16_p2 gi|6165199 1.77 

328329 CH.07_hs gi|5858375 1.77 

322934 AI493054 Hs.158968 ESTs 1.77 

325902 CH.16_hs gi|5867101 1.76 

322239 W01813 Hs.12109 WD40 protein Ciao1 1.76 

303530 AI274851 H&25B744 ESTs 1.76 

300980 AI025527 Hs.222097 ESTs 1.76 

331909 AA437300 Hs.178210 ESTs 1.76 

321553 K92449 Hs.116406 ESTs 1.76 

301618 T52760 EST cluster (not in UniGene) with exon hit 1.76 

319592 AA627356 Hs.163315 ESTs 1.76 

318511 T26528 Hs.227175 ESTs; Weakly similar to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 1.76 

327183 CH.01_hs gi|5887442 1.76 

313516 AA029058 Hs.135145 ESTs 1.76 

318644 AI752482 EST cluster (not in UniGene) 1.76 

321632 AA419617 EST cluster (not in UniGene) 1.76 

324657 AW451142 Hs.255628 ESTs 1.76 

300437 AW449374 Hs.257149 ESTs 1.75 

319775 AA504429 Hs.6211 methyl-CpG binding domain protein 1 1.75 

314775 AI149880 Hs.188809 ESTs 1.75 

337460 CH22_FGENES.78M 1.75 

309849 AW297444 EST singleton (not in UniGene) with exon hit 1.75 

301471 AA995014 Hs.129544 ESTs; Weakly similar to ORF YLL027w [S.cerevisiae] 1.75 

312739 AI318426 Hs.155925 ESTs 1.75 

319995 H15355 Hs.60887 ESTs 1.75 

326495 CH.19_hs gi|5867423 1.75 

337497 CH22.FGENES.801-4 1.75 

322633 AA004534 Hs.153931 ESTs 1.75 

332177 F10812 Hs.101433 ESTs 1.75 

326930 CH.21_hs gi|6456782 1.75 

316893 AAS37332 EST cluster (not in UniGene) 1.75 
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324826 AA704806 Hs.143842 ESTs 1.75 

311289 A1656924 Hs.174257 ESTs 1.75

309375 AW075342 EST singleton (not in UniGene) with exon hit 1.75

314171 AI821895 Hs.193431 ESTs 1.75

311684 A1990741 H&252809 ESTs 1.75

334387 CH22J=GENES580_1 1.75

312195 A130010I Hs^52222 ESTs 1.75

315707 AI418055 Hs.161160 ESTs 1.74

324349 AW501470 EST cluster (not in UniGene) 1.74

300724 AI762929 Hsi06134 ESTs; Weakly similar to similar to reverse transcriptase [C.elegans] 1.74

309906 AW339340 EST singleton (not in UniGene) with exon hit 1.74

303714 AW501336 EST cluster (not in UniGene) with exon hit 1.74

318704 Z24S81 EST cluster (not in UniGene) 1.74

303027 AF111178 EST cluster (not in UniGene) with exon hit 1.74

322601 W92924 EST cluster (not in UniGene) 1.74

319382 H93199 HS33665 ESTs 1.74

315858 AA737345 EST cluster (not in UniGene) 1.74

332243 N55484 Hs520540 ESTs; Highly similar to ARYL HYDROCARBON RECEPTOR NUCLEAR TRANSLOCATOR [H.sapiens] 1.74

330951 H02566 Hs.191268 Homo sapiens mRNA; cDNA DKFZp434N174 (from clone DKFZp434N174) 1.74

324044 AL045752 Hs311519 ESTs 1.73

320630 AA199847 EST cluster (not in UniGene) 1.73

327283 CHX1_hsgi)5867481 1.73

314986 AI201367 Hs.142860 ESTs 1.73

319078 H17255 Hs.144515 ESTs 1.73

326278 CH.17_hsfli|5867269 1.73

302552 H49782 EST cluster (not in UniGene) with exon hit 1.73

322322 AF08S431 EST cluster (not in UniGene) 1.73

327075 CH.21_hsgij6531965 1.73

317392 AI797588 Hs.145459 ESTs 1.73

300810 AI076890 Hs.186949 ESTs 1.73

315978 AA830893 Hs.119769 ESTs 1.73

323903 AA773580 Hs.193598 ESTs 1.73

330803 AA004699 Hs.150580 putative translation initiation factor 1.73

309845 AW2S68Q2 Hs£55580 EST 1.73

314963 AK89617 Hs.200934 ESTs 1.73

311710 F09774 Hs.175971 ESTs 1.73

315315 AI984592 Hs.15088 ESTs 1.73

300378 AA663560 Hsi35873 ESTs; Weakly similar to K11C42 [C.elegans] 1.73

316141 AW303457 EST cluster (not in UniGene) 1.72

319826 T71739 Hs.75442 albumin 1.72

312961 AI033922 Hs.122517 ESTs 1.72

334379 CH22_FGENES.379_11 1.72

305854 AA862733 EST singleton (not in UniGene) with exon hit 1.72

313031 N34927 Hs.186566 ESTs 1.72

329728 CH.14_p2gi|6065785 1.72

312090 N57692 Hs.118054 ESTs 1.72

323341 AL134875 Hs.192388 ESTs 1.72

302077 AA310580 Hs.132898 Homo sapiens chromosome 11; BAC CIT-HSP-311e8 (BC269730) containing the hFEN1 gene 1.71

310766 AI971438 Hs.158824 ESTs 1.71

311450 AI809985 Hs303340 ESTs 1.71

311792 AW238064 H&253909 ESTs 1.71

321500 H71S99 EST cluster (not in UniGene) 1.71

311943 T78791 Hs241569 ESTs; Moderately similar to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 1.71

302270 R56151 EST cluster (not in UniGene) with exon hit 1.71

329089 CttXJisgi]5868614 1.71

322331 AF086467 EST cluster (not in UniGene) 1.71

318235 AI080361 Hs.134217 ESTs 1.71

304561 AA489792 EST singleton (not in UniGene) with exon hit 1.71

312681 AIQ28149 Hs.193124 pyruvate dehydrogenase kinase; isoenzyme 3 1.71

310250 AI478629 Hs.158465 ESTs 1.71

338178 CH22_EM:ACOO5SO0.GENSCAfi21 9-6 1.71

338910 CH22_DJ32hO.GENSCAN.11-2 1.71

321225 AL080073 H&251414 Homo sapiens mRNA; cDNA DKFZp564B1462 (from clone DKFZp564B1462) 1.7

322289 AA5345S0 Hs539 ribosomal protein S29 1.7

319802 AI7014B9 Hsi02501 ESTs 1.7

314022 AW452420 H&248678 ESTs 1.7

314937 AA515602 Hs.152330 ESTs 1.7
300580 AA761322 Hs_20538 ESTs

304398 AA262785 EST singleton (not in UniGene) with exon hit

313421 AW339515 Hs.163700 ESTs

309763 AW270182 EST singleton (not in UniGene) with exon hit

322092 AP085833 EST cluster (not in UniGene)

315603 AA764768 Ks.121158 ESTs

325031 T08597 EST cluster (not in UniGene)

327157 CH01JlS8ij5B66841

314809 AI741461 Hs.161904 ESTs

320361 H67220 Hs.146406 nitrilase 1

324721 AW4O2302 Hs.43616 ESTs

328624 CHj07_hsgp68246

303344 AA2S5977 H&250646 ESTs; Highly similar to

328960 CK08JIS ^6456775

315702 AA657501 Hs.146315 ESTs

302385 AJ224172 Hs204096 lipophilin B (uteroglobin family member); prostatein-like

319699 R14537 EST cluster (not in UniGene)

309506 AW137700 EST singleton (not in UniGene) with exon hit

330417 D84424 Hs57697 hyaluronan synthase 1

315295 AA876905 Hs.125286 ESTs

328538 CH.07_hsgi£868485

323923 AA354146 EST cluster (not in UniGene)

320303 AL079289 Hs.137154 Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 35971

302967 AI927068 Hs.110853 ESTs; Weakly similar to R10D12.12 [C.elegans]

310695 A1472124 Hs.157757 ESTs

307512 AE73815 Hs.242463 keratin 8

333506 CH22_EM^C00550aGENSCAM390-10

331722 AA195405 Hs.110347 Homo sapiens mRNA for alpha integral bin.

301431 R05385 EST cluster (not in UniGene) with exon hit

318853 Z42977 Hs.21062 ESTs

323032 AW244073 Hs.145948 ESTs

317538 AW137772 Hs.185980 ESTs

325780 CH.14_nsg^63B1953

321739 AUJ80280 EST cluster (not in UniGene)

31S808 T58960 EST cluster (not in UniGene)

313443 AA243037 EST cluster (not in UniGene)

331366 AA424754 Hs.43149 ESTs

316443 AI797592 H&207407 ESTs

322878 AA081820 EST cluster (not in UniGene)

330320 CHX»j)2gl|5932415

329081 CHJUtsgl|5868602

334026 CH22J=GENES318_3

317791 AI801500 Hs.128457 ESTs

ymxi AF086106 EST cluster (not in UniGene)

331148 R73816 Hs.17385 ESTs

325452 CH.12_hsgi|5666941

315106 AW452184 H&232100 ESTs

326014 CH.16_hsgl|5867160

307130 A1185234 EST singleton (not in UniGene) with exon hit

300943 AA524545 H_224630 ESTs

319402 W21298 EST cluster (not in UniGene)

310889 AI457946 Hs.170437 ESTs; Weakly similar to hyperpolarization-activated; cyclic nucleotide-gated channel 2 [H.sapiens]

AL135118 EST cluster (not in UniGene)

CH22jF6ENESX81_4

AW263086 Hs.118112 ESTs

CH22_PA59H18.GENSCAN.3-1

CH.16_p2gp23963

AW205477 Hs.179891 ESTs

CH22_FGENES.395J

AI064824 Hs.193385 ESTs

AW204480 H_253414 EST

AW148928 H-248895 EST

AI421641 EST singleton (not in UniGene) with exon hit

AW369770 Hs.130351 ESTs

AA401858 H&224843 ESTs

CH22_EM-AC005500.G9iSCAN517-t6

AA232729 Hs.154302 ESTs

AW139993 Hs.163682 ESTs

335568
320654 
338983 
330002 
315343 
334487 
312169 



309518 



316787 
300835 
338763 
303327 
313231 



1.7 

1.7 

1.7 

1.7 

1.7 

1.7 

1.7 

1.7 

1.7

1.69

1.68

1.68

1.68

1.68

1.68

1.68

1.67

1.67

1.67

1.67

1.67

1.67

1.66

1.66

334073 CH22_FGENES.327_28 1.65

319901 T77138 Hs*765 RNA helicase-related protein 1.65

326530 CH.19Jlsgi|5867441 1.65

301126 AB02877 H&210843 ESTs; Weakly similar to (U1039KS2 [H.sapiens] 1.65

314043 AAS27082 EST cluster (not in UniGene) 1.65

304387 AA236027 EST singleton (not in UniGene) with exon hit 1.65

322932 AA099732 EST cluster (not in UniGene) 1.65

837272 CH22_FGENES.660-1 1.64

332694 AA262768 H&243901 KIAA1067 protein 1.64

318996 Z44266 EST cluster (not in UniGene) 1.64

315338 AW342028 Hs*56112 ESTs 1.64

313329 AW293704 H&122658 ESTs 1.64

318088 AW295409 Hs.137945 ESTs 1.64

313835 AI53B438 Hs.159087 ESTs 1.64

320035 AA378974 Hs.130720 ESTs; Weakly similar to CELLULAR NUCLEIC ACID BINDING PROTEIN [H.sapiens] 1.64

309372 AW074330 EST singleton (not in UniGene) with exon hit 1.63

324157 AW402236 EST cluster (not in UniGene) 1.63

323929 AA354940 Hs.145958 ESTs 1.63

302490 AA885502 Hs.187032 ESTs 1.63

333942 CH22J=GENES301J 1.63

327469 CH02_hsgil5867772 1.63

301918 AA476777 EST cluster (not in UniGene) with exon hit 1.63

315664 AI744068 Hs.160712 ESTs 1.63

304405 AA282572 EST singleton (not in UniGene) with exon hit 1.63

310624 AI341594 Hs.157522 ESTs; Moderately similar to env protein [H.sapiens] 1.63

319250 F11623 EST cluster (not in UniGene) 1.63

310608 AI982234 Hs.196102 ESTs 1.63

317348 AI348078 Hs&31 3-hydroxymethyl-3-methylglutaryl-Coenzyme A lyase (hydroxymethylglutaricaciduria) 1.63

306513 AA989230 EST singleton (not in UniGene) with exon hit 1.63

320807 AA086110 Hs.188536 Homo sapiens clone 24838 mRNA sequence 1.63

303710 AE69069 H&250852 ESTs; Highly similar to ubiquitin hydrolyzing enzyme I [H.sapiens] 1.63

328291 CR07jisgq5868363 1.63

304236 W93278 EST singleton (not in UniGene) with exon hit 1.63

317683 AI791700 Hs.127893 ESTs 1.63

311960 AW440133 Hs.189690 ESTs 1.62

312834 AI028309 Hs.114246 ESTs 1.62

825326 CK11_hsgi|5866875 1.62

313663 AB53261 Hs.169813 ESTs 1.62

327526 CR02_hsgil6381882 1.62

300429 AW449679 Hs.156739 ESTs; Highly similar to XG GLYCOPROTEIN PRECURSOR [H.sapiens] 1.62

305169 AA663131 EST singleton (not in UniGene) with exon hit 1.62

316621 AM21995 Hs.122138 ESTs 1.62

329666 CH.14_p2gi|6272129 1.62

318035 AI744130 Hs.131201 ESTs 1.62

300492 AL031709 multiple UniGene matches 1.62

316532 AI307229 Hs.184304 ESTs 1.62

332048 AA496019 Hs.201591 ESTs 1.62

307113 AI18368S EST singleton (not in UniGene) with exon hit 1.62

319127 N49476 EST cluster (not in UniGene) 1.62

331155 R876S0 Hs33439 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.61

338220 CH22_EM:AC005500.GENSCAN.24&9 1.61

315763 AW515270 Hs.118342 ESTs 1.61

323571 AA984133 Hs.153260 c-Cbl-interacting protein 1.61

312240 R28628 H&203669 ESTs 1.61

304569 AA490934 EST singleton (not in UniGene) with exon hit 1.61

313179 AI076101 Hs.131704 ESTs 1.61

326658 CR20_hsgi|6552462 1.61

317276 AI823847 Hs.129986 ESTs 1.61

312572 AA350125 Hs.187499 ESTs 1.61

311932 AW451654 H&257482 ESTs 1.61

302103 AA452310 H&26090 ESTs; Weakly similar to T20B12.1 [C.elegans] 1.61

308413 AI636253 Hs.196511 EST 1.61

310077 AI620617 Hs.148565 ESTs 1.61

337780 CH22_EM*C0O)097.GENSCAN.121-2 1.61

327796 CH.05_hsgl|58679B2 1.61

308352 AI610791 EST singleton (not in UniGene) with exon hit 1.61

324539 AI378032 Hs.125892 ESTs 1.61

303232 AA437414 EST cluster (not in UniGene) with exon hit 1.61

337884 CH22_EMJWX0550aGENSCAN54-2 1.61

303620 AA397546 Hs.119151 ESTs 1.61

303481 AA336839 EST cluster (not in UniGene) with exon hit 1.61

314481 AA548589 Hs.105846 ESTs 1.61

300327 AB08894 H&24S893 ESTs 1.6

323473 AA262442 EST cluster (not in UniGene) 1.6

325154 CH.17Jlsgl|5867170 1.6

331920 AA446885 Hs.99087 ESTs; Moderately similar to ZINC FINGER PROTEIN 141 [H.sapiens] 1.6

323827 AW406878 EST cluster (not in UniGene) 1.6

322452 W56710 EST cluster (not in UniGene) 1.6

310597 AI739071 Hs.158515 ESTs 1.6

307871 AI358665 EST singleton (not in UniGene) with exon hit 1.6

322215 AF088005 EST cluster (not in UniGene) 1.6

318420 AI139B57 H&143837 ESTs 1.6

332217 H93S87 Hs.102383 EST 1.6

324937 M79230 Hs.182388 ESTs 1.6

320543 AF052176 Hs.158529 Homo sapiens clone 24457 mRNA sequence 1.6

300S74 AW467383 EST cluster (not in UniGene) with exon hit 1.6

315193 AI241331 Hs.131765 ESTs 1.6

319713 R24204 EST cluster (not in UniGene) 1.6

301210 AI379982 Hs.158944 ESTs 1.6

309365 AWD72881 EST singleton (not in UniGene) with exon hit 1.6

321403 AW451454 H&247568 adenylate kinase 3 1.6

321908 AA376936 HS20998 ESTs 1.6

303349 AA382661 EST cluster (not in UniGene) with exon hit 1.6

324338 AL138357 H&247514 ESTs 1.6

310599 AW300144 EST cluster (not in UniGene) 1.6

333193 CH22J=GENES58_15 1.6

338433 CH22_FGENES.825_12 1.6

312097 AI352096 Hs.157169 ESTs 1.6

311445 AW204237 Hs.192703 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.59

317736 AI361722 Hs.192410 ESTs 1.59

303147 AI498991 EST singleton (not in UniGene) with exon hit 1.59

313489 AA017492 Hs.135655 ESTs 1.59

316289 AA902488 Hs.122952 ESTs 1.59

32S983 CR21_hsgq58B7657 1.59

314781 AW205298 H&202372 ESTs 1.59

328397 CR07_hsg!|5868397 1.59

331970 AA461084 Hs.187677 ESTs 1.59

321744 N91419 Hs.12028 ESTs 1.59

310509 AK92181 Hs.150036 ESTs 1.59

315921 AI147545 Hs.114172 ESTs 1.59

322049 AI928242 Hs.144383 ESTs 1.59

301161 AA731518 EST cluster (not in UniGene) with exon hit 1.59

300548 AI026836 Hal 14689 ESTs 1.59

319142 F073SS EST cluster (not in UniGene) 1.59

313526 AW152263 Hs249243 ESTs 1.59

305937 AA883238 EST singleton (not in UniGene) with exon hit 1.58

330123 CH.19_p2 gi|6671869 1.58

327819 CH55Jlsg55867968 1.58

318250 AI478814 H&134603 ESTs 1.58

308760 A1034094 Hs.169476 tubulin; alpha; ubiquitous 1.58

322358 AA220235 HS246836 ESTs 1.58

317866 AI690269 Hs201345 ESTs 1.58

320725 AA703319 Hs.120967 ESTs 1.58

311332 AW292247 H&2S5052 ESTs 1.58

334893 CH22_FGENES.452_7 1.58

318730 AA398215 EST cluster (not in UniGene) 1.58

315889 AW271639 Hs221744 ESTs 1.58

303702 AW500748 H&224961 ESTs; Weakly similar to 73 kDA subunit of cleavage and polyadenylation specificity factor [H.sapiens] 1.57

315086 AI492660 Hs.170935 ESTs 1.57

332514 AA156499 Hs£454 protein kinase; cAMP-dependent; regulatory; type II; alpha 1.57

335549 CH22_FGENES576_10 1.57

329532 CR10_p2gI|39835O5 1.57

323140 AA180467 EST cluster (not in UniGene) 1.57

313168 AI801098 Hs.151500 ESTs 1.57

337896 CH22_Et*ACO05500.GENSCAN.56-3 1.57

330658 AA319514 Hs211093 ESTs 1.57

324585 AI823969 Hs.132678 ESTs 1.57
3171S1 AW298195 H&255735 ESTs 1.57

308818 AB19700 HS508231 EST 1.57

326547 CR19_hsgI15867307 1.57

318833 H06234 H&24888 ESTs 1.57

320488 R31386 EST cluster (not in UniGene) 1.57

306929 AI124514 EST singleton (not in UniGene) with exon hit 1.57

338083 CH22_EM^CO0550aGENSCAN.174-1 1.57

316868 A1660698 Hs.195602 ESTs 1.57

310077 ns.l/U4aU co IS 1.57

328638 Ca07_hs 0^6004473 1.57

310074 AI651039 Hs.148559 ESTs 1.56

327058 CH21_hsgp531985 1.56

320076 AI653733 H&204079 ESTs 1.56

322345 AF086529 EST cluster (not in UniGene) 1.56

314731 AI745498 H&204579 ESTs 1.56

318687 H49619 Hs.127301 ESTs 1.56

303841 AI934464 EST cluster (not in UniGene) with exon hit 1.56

302370 AJ009849 Hs.199297 Homo sapiens GNAS1 gene encoding NESP55 1.56

322571 AF156271 EST cluster (not in UniGene) 1.56

318050 A1052093 Hs.133132 ESTs 1.56

303388 AUJ39604 EST cluster (not in UniGene) with exon hit 1.56

323758 AA833858 EST cluster (not in UniGene) 1.56

328369 CH07Jisgij58683a8 1.56

329415 CaYJlsgi|5868874 1.56

303915 AW468839 H&257767 EST 1.56

338794 Cn22_EnuACu05500.ucNSCAI%528'1 1.56

303074 AA243481 Hs.127320 ESTs; Weakly similar to KIAA0346 [H.sapiens] 1.56

318807 F08434 EST cluster (not in UniGene) 1.56

334287 CH22_FGENES.369_17 1.56

311928 AW024788 H&233374 ESTs 1.55

304592 AA505833 Hs.162017 EST 1.55

300765 AAS82913 Hs£47179 ESTs; Weakly similar to KIAA0319 [H.sapiens] 1.55

304921 AA603092 EST singleton (not in UniGene) with exon hit 1.55

324605 AW502851 Hs349978 ESTs 1.55

324473 AW501163 EST cluster (not in UniGene) 1.55

300568 H88709 Hs.21371 son of sevenless (Drosophila) homolog 1 1.55

314165 AA761265 Hsi21281 ESTs 1.55

302868 AA157392 EST cluster (not in UniGene) with exon hit 1.55

314034 AI299137 Hs.154214 ESTs 1.55

325389 CH.12_hsgiI5866921 1.55

331849 AA417078 Hs.193767 ESTs 1.55

320536 AA331732 Hs.137224 ESTs 1.55

303347 AA258033 EST cluster (not in UniGene) with exon hit 1.55

315769 AA744875 Hs.189413 ESTs 1.55

317031 AA973297 Hs.126101 ESTs 1.55

300203 AI827065 Hs224877 ESTs 1.55

304037 T26438 EST singleton (not in UniGene) with exon hit 1.55

322613 AW160507 EST cluster (not in UniGene) 1.54

317887 AW138174 Ks.130651 ESTs 1.54

322313 AP086386 EST cluster (not in UniGene) 1.54

323992 AW411383 Ks.169688 ESTs 1.54

325303 CH.11Jisgi]5866908 1.54

312701 AI457663 Hs.128127 ESTs 1.54

304787 AA582678 EST singleton (not in UniGene) with exon hit 1.54

305849 AA861571 EST singleton (not in UniGene) with exon hit 1.54

314557 AA401367 Hs.128647 ESTs 1.54

316507 AI381515 Hs.158381 ESTs 1.54

315023 AA533505 Hs.185844 ESTs 1.54

314920 AA513406 Hs.152307 ESTs 1.54

323097 2X4354 Hs.180950 guanine nucleotide binding protein (G protein); q polypeptide 1.54

325043 W27919 Hs32944 inositol polyphosphate-4-phosphatase; type 1; 107kD 1.54

307892 A076O86 Hs.158759 EST 1.54

324573 AA491600 Hs.161942 ESTs 1.54

313092 AI923673 Hsi12827 ESTs 1.54

324695 AA641092 H&257339 ESTs 1.54

303019 AF098363 EST cluster (not in UniGene) with exon hit 1.54

317158 AI459140 Hs.129109 ESTs 1.54

309536 AW151933 EST singleton (not in UniGene) with exon hit 1.54

301568 AI146423 Hs.146709 ESTs 1.53

315574 M551923 Hs.191850 ESTs 1.53

321881 N79341 EST cluster (not in UniGene) 1.53

310890 AI184510 Hs.143728 ESTs 1.53

330038 CH.17_p2gil6O42048 1.53

316907 AA843868 Hs.190567 ESTs 1.53

312299 AA972712 Hs.174818 ESTs 1.53

331128 R513S1 Hs23423 ESTs 1.53

305177 AAS63591 EST singleton (not in UniGene) with exon hit 1.53

337685 CH22_EMACOO0097.GENSCAN.77-1 1.53

335230 CH22_FGENES527_3 1.53

308898 A1856667 EST singleton (not in UniGene) with exon hit 1.53

307844 AM18246 EST singleton (not in UniGene) with exon hit 1.53

300867 AW340374 Hs.121033 neural precursor cell expressed; developmentally down-regulated 1 1.53

335320 CH22J=GENES534_7 1.53

32S841 CH.14jl2giI6672062 1.53

317918 A1565071 Hs.159983 ESTs 1.53

332901 CH22_FGENES.36_2 1.53

305413 AA724659 EST singleton (not in UniGene) with exon hit 1.53

316707 AI016387 Hs.184406 ESTs 1.53

313693 AW469180 Ks.170651 ESTs 1.53

316101 AA922236 H&221037 ESTs 1.53

320796 AF038966 Hs.184543 secretory carrier membrane protein 1 1.53

307451 A1248S15 EST singleton (not in UniGene) with exon hit 1.53

323648 AI678968 H&152060 ESTs 1.53

331482 N2751S Hs.40296 ESTs 1.53

318059 AIQ23175 Hs.167022 ESTs 1.53

325958 CH.16_hsgil5867142 1.53

315736 AA664265 H&230213 ESTs 1.53

314740 AW015667 Hs.119427 ESTs 1.52

314117 AA224368 Hs.185164 ESTs 1.52

301646 AA313954 EST cluster (not in UniGene) with exon hit 1.52

338752 CH22_EMAC005500.GENSCAN513-10 1.52

309314 AW009312 EST singleton (not in UniGene) with exon hit 1.52

301445 AI208364 Hs.128233 ESTs; Weakly similar to REGULATOR OF CHROMOSOME CONDENSATION [H.sapiens] 1.52

308501 AI685263 H&201150 EST 1.52

312330 AAS35305 Hs.121574 ESTs 1.52

318040 AI018150 Hs.148781 ESTs 1.52

336205 CH22_FGENES.719_10 1.52

325701 CH.14Jisgil5867028 1.52

315009 AW189460 H&208358 ESTs 1.52

303121 AW407585 H&27769 ESTs; Weakly similar to mCAC [M.musculus] 1.52

309271 AI986221 EST singleton (not in UniGene) with exon hit 1.52

328385 CHXJ7Jlsgil5868395 1.52

307700 AI318545 EST singleton (not in UniGene) with exon hit 1.52

314591 AW103292 HS245328 ESTs 1.52

304484 AA432067 Hs258373 ESTs 1.52

304382 AA232873 EST singleton (not in UniGene) with exon hit 1.52

304232 W52674 EST singleton (not in UniGene) with exon hit 1.52

309853 AW298169 Ks57553 tousled-like kinase 2 1.52

312504 AW207346 Hs.143202 ESTs 1.52

313134 N63406 Hs258697 ESTs 1.52

330391 AF015950 Hs.115256 telomerase reverse transcriptase 1.52

314342 AI873046 H&258775 ESTs 1.51

305977 AA887293 EST singleton (not in UniGene) with exon hit 1.51

301165 N85789 Hs224155 ESTs; Weakly similar to PTERIN-4-ALPHA-CARBINOLAMINE DEHYDRATASE [H.sapiens] 1.51

300613 A1932294 H&249604 ESTs; Weakly similar to B-CELL LYMPHOMA 6 PROTEIN [H.sapiens] 1.51

324124 AI554212 Hs.185664 ESTs; Weakly similar to SERINE/THREONINE-PROTEIN KINASE NRK2 [H.sapiens] 1.51

308037 AI458207 Hs.174181 ESTs 1.51

323909 AL043148 Hs.186257 ESTs 1.51

315464 AW139500 Hs.116135 ESTs 1.51

306700 AI022056 EST singleton (not in UniGene) with exon hit 1.51

337976 CH22_EMAC005500.GENSCAN.107-1 1.51

306855 AI083982 EST singleton (not in UniGene) with exon hit 1.51

311045 AI569399 Hs.174748 ESTs 1.51

315010 AA531082 H&240049 ESTs 1.51

310205 AW025248 Hs.202445 ESTs 1.51
310759 AW135924 H&224883 ESTs 1.51

310954 AW449044 Hs.171298 ESTs 1.51

312019 T77048 Hs.188750 ESTs 1.51

334773 CH22_FGENESj430_5 1.51

332043 AA490831 Hs.125056 ESTs 1.51

322850 AA296219 EST cluster (not in UniGene) 1.51

337920 CH22_BtAC005500.GENSCAN£7-3 1.51

328993 Ca09Jisg$888538 1.51

309245 AI972447 EST singleton (not in UniGene) with exon hit 1.51

312172 AE22168 Hs.191168 ESTs 1.51

304039 T47349 EST singleton (not in UniGene) with exon hit 1.5

301329 AI149653 Hs.190498 ESTs 1.5

313376 AI949246 Hs200381 ESTs 1.5

324248 AW504918 EST cluster (not in UniGene) 1.5

308771 M809301 EST singleton (not in UniGene) with exon hit 1.5

334935 CH22_FQENE&464_3 1.5

319764 AA019827 EST cluster (not in UniGene) 1.5

318519 T27135 EST cluster (not in UniGene) 1.5

332807 CH22_FGBIES.7_9 1.5

322310 AF08S376 EST cluster (not in UniGene) 1.5

324557 AA48916S Hs.156933 ESTs 1.5

332118 AA609585 Hs.162689 EST 1.5

319539 R09027 EST cluster (not in UniGene) 1.5

313149 AW291092 HS201058 ESTs 1.5

329722 CR14_p2gi|606578S 1.5

323514 AA861209 EST cluster (not in UniGene) 1.5

308078 AI472621 EST singleton (not in UniGene) with exon hit 1.5

337985 CH22_BAAC005500.GENSCAN.100-10 1.5

335905 CH22_FGENES£35_13 1.5

TABLE 14A shows the accession numbers for those primekeys lacking UniGene IDs for
Table 14. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from 
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity 
using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank 
accession numbers for sequences comprising each cluster are listed in the "Accession" 
column. 



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers
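
For illustration only (not part of the original disclosure), the three-column layout described above can be read programmatically. The sketch below assumes a row that survives intact on a single line, with a six-digit Pkey, a CAT number of the form digits.digits, and whitespace-separated accession numbers. The names ClusterRow and parse_row, and the regular expression, are hypothetical choices made for this example.

```python
import re
from typing import List, NamedTuple, Optional


class ClusterRow(NamedTuple):
    pkey: int              # unique Eos probeset identifier
    cat_number: str        # gene cluster number, e.g. "423880.2"
    accessions: List[str]  # Genbank accessions making up the cluster


# Assumed row shape: six-digit Pkey, CAT number, then one or more accessions.
_ROW = re.compile(r"^(\d{6})\s+(\d+\.\d+)\s+(\S.*)$")


def parse_row(line: str) -> Optional[ClusterRow]:
    """Return the parsed row, or None if the line does not fit the assumed shape."""
    match = _ROW.match(line.strip())
    if match is None:
        return None
    pkey, cat_number, rest = match.groups()
    return ClusterRow(int(pkey), cat_number, rest.split())


if __name__ == "__main__":
    print(parse_row("316141 423880.2 AW303457 AA972713 AA724265"))
```

Lines that carry only accession numbers (continuations of a long cluster) or only a Pkey do not match and return None, so a caller would have to decide how to attach them to the preceding row.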



Pkey CAT number Accession 



322064 234514J 

321409 197838.1 

322092 46678.1 

3214S2 212379J 

313603 189797.1 

320856 36098.1 



322139 
321500 
313733 
322215 

321632 
313833 
322310 
322313 

322331 
322345 
322347 
322370 
321739 
321781 
314570 
300129 
322452 
321861 
323140 

321914 
322571 
322574 
314753 
300370 



46806.1 

552826.1 

441212.1 

47002.1 

47070.1 

286374.1 

120893.1 

47376.1 

47386.1 

47434.1 

47467.1 

47537.1 

47545.1 

187612.1 

43998.1 

1511778.1 

280469.1 

635249.1 

497108J 

1651920.1 

159551.1 

38916.1 

85114.1 

22297.1 

39412.1 

311451.1 

3910J 



322601 577912.1 
322613 34330.1 



316055 409389.1 
323316 981458.1 
300492 25768.1 



BE261397 Z78343 BE176419 AA383657 N90640 AA334052 AW955761 BE536232 AA374087 AA584776 

N71838 AA282003 T54072 AA761419 H92966 AI831371 AI0S5435 AI690247 R99331 AW9641 10 AA975590 AA346128 

H94198 C03864 

AF085333 R69689 AW341677 AA923375 BE327566 AW630415 R69601 AW615339 
AW9S2489 K64300 AA329527 
AA284333 AW4681 19 AA284334 AA81 0992 

AB040928 T94673 AE89313 AI53S039 Z44366 BE141499 D60116 D61488 D59945 AA419503 R2B030 R72986 K0325S 
AI189112 AI912312 AW51 1018 AI401349 AW470144 C14624 AI335797 Z40300 AI014456 D60269 D601 15 T16722 AB70673 
D60270 

H53744AF075088HS3797 
BE004271 AI248023AI022157 H71999 
AA766346 AA809877 AA8361 16 AW469598 AW977404 
AF088005 N51816N51731 

AF086106 Al 1 93589 AW665594 N7179S AA722627 AW665373 AI3002S1 

AW812795 AA419617 H87827 AW299775 AW382168 AW382133 BE171659 AW392392 BE171641 AA541393 

AA766825 AA811180 AA085906 AI762946 AW977820 

AF086376 W77804 W72689 AA837735 

AF0863B6W77947W72708 

ARJ86431 AA886756 AI557237 

AF086467W81444W81445 

W95298 AF086529 AI912190 AW294159 AI458747 W94782 

AF086538 W95969 AJ63191 1 W95835 

AA330095 W25112 AA249401 

AL080280 T73124 H02689 AL08Q281 

D78667 078871 C18258 

AA904776 AA405696 AA405962 

AW028820AE19068 

AI147202 W56755 W56710 

N79341N99082 N47551 

AA1 80467 AA449184 AA464831 AA505048 

T55953 T57205 AF147346 

AA011603 N58604N58611 

NM.016102 AF156271 AA781868 AW152318 AW7704O3 AA909463 AA482996 AA758672 
AF156548 AA639797 AI675267 A1825497 AI823355 
AA463262 AA463615 AW160405 AW407583 

AW136181 AA581939 AK001221 AA694538 AA424043 A1016272 AA098960 AA884473 AI356180 BE391633 AA437086 
AI277866 AA098827 AA992680 BE172624 AA424101 AA320776 AW962967 N77431 AW858960 AW858897 T85649 
AA357743 AI827817 A1905672 

AI082395 W92924 BE048524 AW005302 AI084474 AI369330 AI827710 AW13S506 AW298694 

AW160507 NM.013367 AF191333 AA384939 AI445790 AA730309 BE397003 BE267753 AI979163 N50386 AW583671 

AW583608 BE074466 BE074479 BE074471 AW976283 AA604393 AW162122 W73648 AI823475 N75898 W73713 

AW470099 AW513238 AW025055 AW613115 AI923379 W58081 AW664525 AW196795 AI143619 AI565152 AA025406 

AA505846 AI685494 AA829964 N59156 N59163 R15442 AA826919 AI610221 AK00120 AA603279 AW150822 AI189513 

AI807122 AI016368 AI335868 AW583389 AI193892 AI956157 A1628879 AW591589 AW583446 AI955406 AW148396 

AI340255 AI867942 AA748525 AA876991 Z38516 AI874002 AI869474 N63100 AA429094 AA082443 

AW105663 AA693880 AW51 7398 AI768507 BE220851 AW978538 AA331489 

BE219300 BE327455 AL134620 R36741 R17996 

AU031709 A1249061 AA907658 A1420444
316141 423880.2 AW303457 AA972713 AA724265

323371 117336J N45114N51465BE08733BAI083551 AL135118BE395609 

307700 30923 1 1 BE280998 BE254670 BE2949S1 BE564979 AW405364 AA069256 AA128837 A1553S87 BE281405 AW4108S0 BE041153 

AJ254811 AW301340 A181333S AW301411 A1609469 AI611807 AI611616 A077823 AB35509 AI613S44BE043165 A1371663 
AI3404S2 AI612066 AW072890 A1254558 AB49884 AB70095 A1613383 A161 1 948 AI61 3353 AI307414 A1318229 AI61268S 
AW305327 AW26B924 AI370063 A1349292 BE049068 AI369098 AW274098 AI344845 AW075187 AI053401 AI345220 
BE138S15 AIS13386 AIS83302 AW30195S AI349661 AI307432 AKJ54168 AI223913 AB12081 AB48942 AI334539 AB09366 
A1370098 AB52S60 AWD86316 AW268911 AW073482 AB79802 AE24284 AW53661 AI334538 AI309369 AB09888 AI310Q23 
AI492709 AB35418 A1053999 AI36S989 AW073478 A1247058 AB49584 AI305875 A1308535 AW071272 AI271487 AB40719 
AI366995 AK23673 AW271066 Affil 1936 AW071296 AB70796 AI2S4385 AI251393 At252562 AW268235 AE54858 
AW071317 AI309102 AI609897 AW268971 AI583267 AI792484 AW075168 BE13B443 AB54126 AB09822 AI310872 
AI61 1953 AI251054 AW276658 AI335405 AW075039 A1311768 AI612028 AW271895 AI612005 AI312240 AW271032 
AI371642 AI334879 AB10194 AI310772 AJ345419 AB34875 AK23914 A1284707 AJ284813 AI349140 AB54853 AB13094 
. AI310170 AI309499 AI312476 A1376484 A1335467 A1340802 AB09815 AI3101 68 AI61 1446 A1345824 BE327775 AI318545 
F17185AW614950 

308362 732518J AW938989 AI613519 

307783 697809 1 A1347274 AW844024 

301161 427238 1 AA731518 AA765714 

324094 270098J BE395109 AW663898 AW237041 A1492154 BE046906 AI651285 AI983290 AW002590 AE01040 F32424 AA992272 
AW271836 

309023 4737 1 AF180681 NMJD15313 AA229509 AA225792 AA216413 AI888045 BE005205 AB002380 155518 BE276097 AW380669 

BE142836 AW370976 AA479334 R96425 AI680999 AA595138 H54582 AI022709 T55440 AI041769 AA861 144 AW3S2028 
AA479287 AA824634 AI638446 H54691 R96382 AA770352 AI640467 AW293491 AA779138 R28298 AA970562 C15590 
R84455 AA020769 AL036394 H80566 BE548861 AA301207 AW959414 AK84253 AA043173 W52429 BE544571 R24852 ■ 
Z42603 F13120 R24340 R24326 T75305 H70110 N56255 AA334210 F11453 AW947285 H80345 AA298992 AW380931 
AE67175Z45421 AW380981 W861 13 AA663590 AA1 67577 BE566760 BE169166 AA449904 AA4S9205 N31 126 W03564 
N31208 AW993277 N44765 AW605275 061449 W68572 AA258190 D60496 AW992964 U46277 H04097 AA370360 
AW957211 AA159775 AI631243 H83367 H21671 D61077 AW392712 N21 1 12 H98522 N45298 N83629 AI393509 AW022043 
AA744886 AI580482 AA723286 AI422244 AI423984 062804 A1088349 AA587890 AI144172 N33275 BE074397 H03399 
D62578 AI056639 AI829918 AA579584 AI089460 AI350124 W68573 AB80828 H98897 AB70468 H83715 W861 14 AA923123 
D57446 AA043174 AW337721 A1256551 AI140017 AW022356 D78855 079650 D79393 D60495 AA788666 AA693443 
AW516977 W60139 AI628156 AW473223 AI608892 AA159670 AW440366 A1421529 T50751 AI174374 AA912234 AA724248 
AW780400 AA907218 H80514 D57452 AA863419 AA552618 D29614 R44556 T16452 R44935 Z41132 D29188 H69692 
AI250176 AI078860 AA370359 AW183108 H74200 AA258183 F10723 C00323 R86148 AA860570 AW130073 AL079946 
AA410327 AA532614 AA234500 AI151507 AM10288 AW969839 AA483232 AI383200 AA236540 AI807672 H73441 

323473 193878J AA262442AA768862 AA262443 

315639 392767J AA827650 AA827652 AW629526 BE044585 AW974451 AA761439 AA648505 AA765803 
322878 117013 1 AA081820AA082191 AA079811 
301239 457668J AA807558 AA8271 17 AW629567 

301256 16720 1 NM016603AF251038 AI124624AA776579 AW298470AC04868AW082724AJ348442BE218336N20641 AI018013 

AW856832 AW978157 AA815187 AA932943 AF157316 AI444958 W00848 W02935 AI434933 N26335 AA428681 AW371059 
AK51612AW134937AW958911 AA488815AL157523W48766AW936954 AW936941 AW579205 AW936888 AW936889 
N74541 AW936953 AW578421 AW604352 AW367088 AW849258 AW849453 AW371606 AI554921 W49785 H99814 
AA805957 AA904606 AW206696 BE169229 AA333951 AA1907O4 AW936944 AA463219 AA430306 AW805704 N48503 
BE222307 AI638612 BE550045 A1305304 AI690987 AA776841 H12690AW183731 AI380760AI636261 AA812641 
AW592656 A1685132 AA843424 H99220 AW084996 AW128879 AI800871 AA610135 AA191524 AI150076 AM74530 
AA748461 N29013 AA746372 N59606 

N75450 AA877636 AW137945 W05248 AA514763 AW972399 AI758397 AW195051 
AW402931 BE393099 
AUB6947T93676TB5475 

AA641735 AA2818B1 AA861209 AA934756 AA8358B7 AA641795 AA748822 AW295703 
AW467388AA826954 

AF16871 1 AA099732 BE019157 AI380212 BE298159 AA249097 AA3051 12 AW962349 AW962353 AW401801 BE292961 
AI439469 AA442919 AI630537 AA724473 AI814288 AW966815 AI376871 AI860202 AI683132 AA099733 AW627633 
AI754022 BE206347 AW183349 A078222 BE178926 A1473282 W52944 AW752469 AW966817 
AA301270 AA301379 AA301366 

R85652 AA114024 AA296219 AA375304 AW963796 AW885952 AW020969 AA114025 AI804930 BE350971 AI765355 
AW317067 AW974763 H85930 AW172600 AI310231 AW612019 D62908 D62864 AA652738 AI674617 AI494064 AW138666 
AI147620 AI147629 AW61 1793 AI668922 AI971005 AI864742 AA174171 

AKD01701 AA134337 AA356202 BE163251 AW875175 AW875181 AW875177 BE163389 AK000741 AA247755 AA120819 
AW866040 AA309118 AW962348 AM71267 AW996843 AK001452 BE005344 BE617899 AA186588 AA120820 AW36331 1 
AA648105 N71529 BE168417 AWS73900 A1858160 AA1 34338 AA659697 N22162 AI335437 AI31 1237 AQ43171 AI336661 
AW268074 AW274348 AA935005 AW576295 AW252626 AW593153 AA730055 AA662650 AA782687 AW894855 AB33533 
AW1 93002 AW899448 AW890142 AW812670 AA085664 AA334191 BE178085 BE180553 AA389680 AA984772 AA442527 
W26560 BE384359 AA847210 AW304931 A1669606 AA085613 AW197240 AK32828 AA581646 AW129348 AI017643 
AW089030 020893 AI382955 A1557148 AW499979 
324231 975669 1 W60827 AUJ79968 AL047234 
324248 977901 1 AW504918 N55410 AL1 1 8584 AW839266 

323691 221757 J AA317561 AI793000AWB35111 AI793178AA767397AE63113AA719462 
300611 337193J 
324157 247225J 
323509 967739J 
323514 197787_1 
300674 466093J 
322932 39838J 



323591 209807J 
322950 10774J 



322957 29014J 
315858 
301431 
324303 
324330 
300315 
324349 
323715 
309314 
323758 
309375 



325031 
325045 
324473 
323827 
302270 
301618 
301646 



406384 1 

569736J 

233342J 

300543J 

41537.2 

1154015.1 

225129.1 

23273_-3 

229624.1 

127_1 



266373.2 

1534945.1 

38795 1 

235506.1 

1734192.1 

10987_5 

42154 1 



323923 249295J 

324580 328264.1 

316774 463723.1 

309577 6483.6 

302345 29533.1 

302358 1064753.1 

324614 215437.1 

3246S1 385257.1 

324685 41003.1 



324692 351987.1 

316893 473541.1 

303027 21796.1 

324715 290035.2 

324771 385085.1 

324783 389815.1 

303114 37417.1 

303124 21112.1 



302552 82290J 
301918 316229.1 
303232 20474.1 



302696 33570.1 

302697 43219.1 
309917 57485J 
303347 192210.1 
303349 193138.1 
310599 690880.1 



AA737345 AA682288 AI799378 

R05385AI061251 

AL1 18754 AA333202 H38001 

AAS84766 AW974271 AA592975 AA447312 

BE152396 BE152395 AA287515 BE001834 AA286678 AW406477 

AW501470 AW502931 AW499500 

AA322155 AA326396 AA326538 

AW009312 

AA833858 AW978090 AA327879 AA810436 

AF286598 AW075342 AB028994 AL043713 AW378914 AA340650 N57166 AW956914 R17961 AA336481 BE393734 
AW977867 AW294638 AA927857 AA961627 AW303969 AW894416 AA812119 AA912758AA424355AA490582W30941 
AA476693 AA131029 AA127777 AUJ43714 AA496984 T51117 AA127722 AA594012 AI492876 N76483 AW119061 BE464926 
AW303419 A1972370 AI768172 AI826550 AI435432 AI379516 AA778421 AI276089 AA424521 N59361 AA723153 AA723176 
AI867487 AA090677 AIB27221 AB51027 W02732 AB10729 AA142848 AI0821 10 N59379 N29744 AB83747 All 48665 
AW779845 AI3829S7 F34319 AI369934 AI282438 AW183449 AA863467 AA813469 AI092645 AI870701 AA863119 
T65475 R07576 T17017 RJ8143Z43546 
T08845 Z43538F06691 

BE560824 BE513941 AW238907 AA5B0852 AW501176 BE241846 AW501163 AW751433 AW501340 BE241715 AI910774 
AW406878 AW966560 AW966151 AW966496 AA336174 AA335376 AA335537 
R56151W91936 
T52761 T52760 

AJ277841 AI630669AI804370Z41939 AW751251 AA299456Z44739AW860471 Z30158AW1 05391 H56997WB4688 
AA491201 WB4636 AA706815 AI131055 AA483636 AI005075 AW340034 AI332372 AW118195 AI338932 AH91968 
AA693932 AI189982 AI193225 AA884163 AA594562 W37747 AA249754 AA746131 A1916540 AI832188 AW946555 
AA833838 Z40564 AA861563 F01447 AA887937 AI933559 AW973250 AA566018 AA313954 
AA354146 AI184230 AA643525 
AA492588 AA492498 AA492571 
AA814859 AA814857 AI582623 
AW902251 AW168753 

X12830NM 000565 AW503691 X58298S72848 AA1 93347 AW503481 AW177946 AW1 78192 AW178188 AA285233 
AA410577 AA193465 AW177939 AW365459 BE221693 
AW207734 060164 D81 150 D81078 061356 AW99S804 
AW503101 AA309184 N56323 R70998 
AW504161 AW503601 AW505509 

AF226667 AA207032 AA1 00804 AA121287 AA488316 AI808218 AW419048 AI91 1097 AW132123 AA502311 AW089948 
AA100952 AI075431 AW083432 AI990554 BE466029 F28843 AF0B6422 W79581 AW439007 F37179 W79780 AW439035 
AA731381 AW75O380 AA251012 AW589848 M730238 AA329792 AW087255 AA220982 AA082469 AA877260 AA232380 
BE298910 

AA557952 AA677593 AA618150 
AW979189 AA837332 AA856946 AA876935 

AF111178 NM.005708 AF105267 AW590040 AI979280 AA001322 BE146329 AA702430 AA702429 AAS94221 AI206348 
AI206285 AW770197 AA923032 AI379586 AA701165 AW594643 AA001909 AW0O2368 

AI739168 AA426249 AI199636 AW505198 AW977291 AA824583 AA883419 AA724079 AI015524 AI377728 AW293682 
AI928140 AA731438 AI092404 AI085630 AA731340 
AA631739 AA768584 AW134477 
AA640770 AI683112 AA913009 
AF090948 AIOS4898 All 1 1 1B2 

AB018257 BE148640 AA081832 AK001915 AF150217 AF161350 AI219174 AW074664 D60040 AA346065 H28750 
AW151783 BE613360 BE612628 BE502031 AW183790 AA992580 AA505815 AB10432 AI678015 AW592679 AA879181 
AA806708 AI744110 H24681 C16064 062909 A1285033 AA346064 AI865123 AW467798 BE221231 AL120676 N89877 
AI928370 AB58387 AA748486 AV647478 AV647460 AA312313 AI279340 AW505099 
AA005122H49792 
AA476777 T86049 

AA437414 AA131479 AA086182 AB037775 AW161063 AW514393 AA332331 AW136197 BE150789 AA425533 AA249605 
N88308 AI016201 BE004662 AA291027 R575B7 AA424277 AA476391 W07532 T97036 AA218898 AW1 62629 R57770 
W01278 W90204 W90156 AL119197 R84513 AA280103 AA334994 AW965504 AA460868 AA447470 AW1 38594 W38898 
W90O28 AI078353 W90O78 AA699696 N3S523 AA704225 AA035059 AW134892 AA1 15140 A1U2854 H90084 AA826342 
AA460694 N46339 AA425344 N56953 AA035569 AI761083 AI658696 AI524818 AI338965 AW069249 AW299871 BE464061 
AI189720 AW340682 AI423380 A1275122 H17532 N80735 AA826343 AI039694 BE328398 AI192947 AW271286 A1623122 
AI922902AVV293087 N22141 AA730657AW316610N26473 F06663 Z43610H14783R59761 H11540AI265915AI681773 
AI091748 BE220636 AW841861 AI702181 AM68447 AA907544 AE73941 AW244034 R37769 AA446663 T96929 BE045884 
AA476341 H89994 H29043 AW051211 N49522 AA306977 

AK000738 AA347452 AW981713 H70832 AI750643 AA362887 AW955588 W44974 AA279599 AW298762 AA452666 

AA443355 A1337273 AA446931 AI752977 AA661554 W42674 AI292172 R41 163 AA621381 AE44157 

AJ001409AJ001410 

AW340014 AW866993 AV651649 

AA258033AA459485 

AA382661 AW958642 AA259088 

AW300144 AI338491 AI798381 BE220076 
303388 869Z32J AL039604AUB9497 

302761 45074.1 AW250553 U07876 Z36843 R30693 A1190097 AW965317 

318455 606341J AI148763 AI903763 AI903753 A1903762 W903800 AI903801 

317850 363835 1 AI681545 A1951714 AI570397 AW873588 AA836396 AB59986 AI499790 AA773477 A1951615 T07547 AW304709 AF1 14041 

BE176629 Z44580 T30422 T32690 AW353065 H10602 
303431 32082_1 NM.000539 AA019013 AA019387 AA056154 H38735 AA057003 AA021051 H38102 AA015774 AA059291 AA019439 H84843 

H83375 AA019914 AA017288 R84449 W26519 H3B258 AA018736 H84147 AA018577 AA059353 U49742 H38767 AA318341 
AA317553 H86646 H91989 AA317398 AA317378 W29024 W23034 T27877 AW950059 AA017195 R84262 AA057177 
H89941 AA019904 H84662 AA015775 AA019368 AA020976 H37B00 C20733 H38682 H85197 AA018578 AA017252 
AA019440 AA059059 H38651 H84148 AA018560 W25754C20752 AA317815 AW952115AA317369AA019845 R85402 
AA019492 AA017196 AA056093 AA056094 AA058836 AA055155 W25957 W23027 AA056159 W23043 W21890 WZ8951 
AA317978 W26459 AA317265 
N49476Z45911R21061 
AA331906AA332484 

AK001952 AA336839 AW249Z71 BEZ47287 AF1BZ002 BE613472 AW9B2673 AA332235 AW849937 AW849814 H49893 
AA477148 AW968944 AF182003 AW007897 BE246145 W761O0 A1480141 AW410205AA609339 AE09111 AWO0G979 
AA330280 AW961554 W72865 H49894 AA514317 AA620407 AA504522 AW472833 AA716609 AW129282 AA347351 
AA628378 AW5898S0 AI636696 AA464632 AA464533 AW874189 AA757076 AA479654 AW517910 AW292357 AW872638 
AW26228B AI910666 AW513749 AW238771 AA215797 BE387073 

BE143533AW850432 AK000042AA333666 AA385314 AW966616 AW793068 AW793414AA361103 AW390841 AA04OO95 
AW385058 AW799162 AB83115 AI990745 AI653703 BE503S93 AW150758 AI949919 AW190450 AW512348 At625970 
AW501057 N52954 AI281378 AI401710 AI648409 AW002659 AI687639 AM93943 B33960 AA04O062 A1926267 AI240425 
AI520911 AI093428R52943 

303488 36085 1 A1040372 AB0409I5 W40569 BE158910 BE158914 D63226 AW025860 AW583088 AA334307 AA210942 AW753212 

AW805322 AA382635 BE15891 1 AW891225 AW994862 AA805451 R28541 AA229347 N48266 A1377788 R28682 R36122 
AA811941 AK40742 AI632001 TS9965 W01976 AW891205 AW891177 T97433 C15571 AA346850AA504293W07500 
AI694503 AA489Z16 AA327725 AW959917 AA694146 N68514 AW76285 AW016246 T07783 AA642400 AA716133 AAB05332 
R00312 AA705021 AW498605 AWB91723 AW891906 AA808025 N29039 N74897 W60393 AA810184 AI627460 AW057516 
AA807436 AA760968 AB59295 N78S42 N20662 AA830300 W81705 AA832258 AW891718 AI811786 AW515523 Z41735 
AA449978 AW891714 AI684539 AW891898 AW071701 AIB90916 AI924994 AI039743 AA888524 AA244214 AI01573S 
AI270105A1865077 

F30712 F35665 AW263888 AIS04O14 AB04018 AA336927 AA336502 
H08370 Z46168 F07366AA183168AA193138 

AK000290 AI476034 AA465309 BE148761 AW303607 AW958665 AW469635 AI819365 AB43857 AW469326 AA157110 
AA278626 AA496257 AA306656 F29732 AA831 859 AA312210 AA564476 AA579065 AA769522 AA740386AE05635 
AA491643 AA810400 AA417708 A1567332 AA157392 N53817 AA374229 
R68545 T271 19 R25687 AW750672 
H13364 127135 R61679 AA746905 r 
H77679 

AB038995 NM_016530AK001 1 1 1 AA465635 AW968716 U66624 AA885459 AA703019 AB40266 AI018689 AB92886 
AI125372 A1376796 AI192040 N58161 AL133607 AW503873 AW505479 AA362265 AJ404671 
F11623H17552AA347728 

BE311816 AK000916 AW868037 AW8S8039 AF228527 AI752482 AW868041 AA077049 AE01537 W55873 AA206019 
AA077918 AW968729 AB78S28 AW139820 AI093053 AW204025 AI418805 AA598926 AA586345 AA045669 BE314455 
AA045668 

W01 166 AW996900 BE184300 Z44887 T34535 R51495 AW886575 AA295490 AA295162 AA295163 AW937125 T56951 
BE386106W52674 

AW50010S BE241915 AW503971 NM.016542 ABO40O57 AA313812 AK000556 W16504 AI822088 AA259107 AA191319 
BE085957 AA309584 BE122687 AW952435 TB4469 BE0S8194 BE088132 AA328562 BE092674 AA263102 T39634 
AW992380 R79391 R24392 H03060 AW675066 AI299952 AW020325 D25953 N75199 AA361425 AW612302 AW236333 
AW873897 AW953686 N22323 AA649166 AB77099 H03061 A1660072 AW276405 AA809779 A1803430 AW297484 
AW510384 AA814816 AA371522 D63035 AA953567 R79392 R24282 AA876831 AW297542 AI699023 AA992652 AI041436 
A1631602 AW589676 Z28684 Z24981 
Z32887 BE349923 AA398215 AA399231 
AW501336AW501337 
AA236027BE003275 

AA195509 BE394661 AV660757 AA489161 BE165972 AW503705 AA262785 AF123320 Z78357 NM.014171 AF161488 
AA248971 BE568575 AA461410 AA165108 AI637731 H75454 AA372934 AW339334 BE568754 BE564697 BE567299 
AI681606 BE537269 AW197204 AA290890 AI189393 AW292463 AW470227 F27399 AW61 1942 BE566888 AW301701 
A1675761 AI828429 AA164711 AI797753 AI856879 AI912690 A1675277 A1695099 AI094095 AW014158 BE091059 AE01748 
AW236961 AI038003 AIO83606 AA401606 AI079405 AI073516 AI655537 AA401475 AB14532 AI079862 AI093789 AM22084 
A1216476 AI392760 AA926998 AA781782Z25198 AI086377 A1185511 All 85539 228843 AE23792 A1379563 AA706253 
AM33798 AI92188S H75455 AW025269 AI224100 AI08361 1 AK25057 AW1 96334 AI572254 AA761628 AI472801 AA283784 
303751 468554 1 AA8301 49 AW978407 M85983 AW503637 

319401 1323199 1 W00973 N56457 AW992226 TB4921 R01342 

319402 1003489.1 R86913 R86901 H25352 R01370 H43764 AW044451 W21298 
318807 1536467 1 F08434Z42573H28810 

319478 765461 T A1524124 R06B41 R06842 
318872 1534581 1 Z43108 F06295 R13085 



303494 236389.1 
319142 164820.1 
302868 12593.1 



318518 1205335.1 

318519 434741.1 
304168 72494.-10 
302948 21445.1 

319250 244351.1 

318644 17700.1 



318674 204968.1 
304232 20640.2 
303685 8088 1 



318704 799152.1 

318730 275116.1 

303714 1155758.1 

304387 183612.1 

304398 10169.1 

318885 94880_2 AA742999 243272 AA345258 AW956677 AA031942 

303841 79133.1 W19657 BE816760 BE259848 BE382680 BE615587 AB34464 AA32274S T07155 AW961174 AA307302 Z41888 AA521992 

AA188400 AW770S08 A1147458 AI148408 A1B9S291 AA972S91 
303889 1777183J T19204T38109 T36107 

319539 63198 1 R09QZ7 AA344892 AA329574 AW955648 AW978708 A1567804 A137893S AW014557 AI804134 R08922 N92947 BE546788 

318905 1536408J F08365Z43395R54298 

320187 396254J T99949 AA654769 AA664550 AW975264 

318996 65715.1 Z44266H06384AV655948 

319835 163534J R17531 AW960899 AA338366 AWB73294 BE047729 BE047722 AA330746 AW8417S7 H05030 AI142105 R12654 
319699 747196J AI458582 H24240 R14537 R18426 AW867082 
319713 1699356J R24204R15712T84695 

319761 75324.2 AW630974BE005208R84237AA724997AA334867AVV955777R18816 

319764 88598J AA019827R18947 H46852 

319808 7069J3 T58960 AA609180 AA621130 AI927238 AM31075 

321040 193331J AA261830 AW967B55 H26953 AA262478 

320409 43709 1 AA2268S9 AA296516 AW959753 AA18B390 AL359619 AA3S6195 M148427 R22748 A1033624 BB488S3 H95327 

AW579751 BE561649 AA397533 BE617136 AA236444 T89946 AA247450 N55777 W38725 AI743848 AI808406 AA922229 
AI051464 W04713 R11251 W19656 AI042319 AA48S276 AE24533 K95274 AW269958 18931 1 A1890088 AI862754 
AI830968AK>69338AI5897B0AA534557AV\I273839AI338155AI1268^ 

AW167978 AA976930 M148428 AI289304 AB24262 AB25961 AA773469 AE22288 AB80054 AE42371 AA227222 
AA973329 AA296517 AA829436 AA234526 AI149769 AI567865 AA936939 AB90681 AW469308 AI6S9531 AA486419 
AI422051 AI057252 AA626941 AI475352 AW247913 AI222370 AA670122 AW198034 AA486418 AI353794 AA380739 



319881 1585983.1 


H51299 H44619 H46391 RB6024 H51892 T72744 


320488 368458 1 


AI817338 R32883 AA595590 A1743065 R31386 


321121 1545647 1 


W23285 H42714 P25381 F37215 


321205 81249 1 


AA002047 N72537 H54142 H81580 


321253 375160 1 


AA610649 AKS9484 M59558 


314043 155125J 


AA827082AA732246AA167611 AA830741 


320830 17685_? 


AA199847 AA410224 R53323 AW936S67 AW936569 AW936568 AW936571 


313435 443527J 


AA769123 AA831715 AW977666 W92553 


313443 82292 T 


AA005125 W95019 W93335 AA249037 


313472 82811 1 


AA007374 AA007466 AB16886 


321348 41762_1 


Z49979 D61703 U30168 


314138 179960J 


AA740616 AA654854 AA229323 


320712 57156.2 


R668S7 R65678 R82673 W73128 R83101 


321383 41924.1 


AW968556 AJ238555 AW968731 AJ002574 AA459446 H70260 AW977557 AA767351 




AI300460 AA907450 AA649224 T07415 AB36896 BE018515 AK79865 BE047421 


312996 187327.1 


AW368634 AI702169 AE45179 AW368646 BE545574 AA249018 AW368633 N27553 


306513 


AA989230 


306537 


AA991705 


306557 


AA994530 


306598 


A1000320 


306620 


AI000929 


306700 


AI022056 


308078 


AM72621 


306813 


AI066544 


306830 


AI075803 


306855 


AI083982 


329722 c14_p2 




329728 c14_p2 




306890 


AI092235 


308100 


AI475949 


308147 


AI498991 


306929 


AI124514 


308352 


AI610791 


308383 


AI624497 


308521 


AI689808 


308561 


AI701559 


308617 


AI738720 


308771 


AI809301 


308828 


AI824829 


308896 


AI858667 


303019 41850.1 


AF098363AF098365 


303084 44211 1 


AF174008 AF174027 AF174106 


305092 AA642912 




305169 


AA663131 


305177 


AA663591 


305235 


AA670460 


305413 


AA724659 
305849 


AA861571 


305854 


AA862733 


307113 


AI183S86 


307130 


AI185234 


305337 


AA883238 


305977 


AA887293 


307451 


A1248615 


307513 


AI274307 


307848 


AI364188 


307871 


AI36B665 


307881 


A1370434 


307832 


AJ230B22 


307944 


AM18246 


307954 


AI419692 


307965 


AM21841 


309245 


AI972447 


309271 


A 988221 


309365 


AW072861 


309372 


AW074330 


309435 


AW090537 


309508 


AW1 37700 


309536 


AW151933 


309709 


AW242630 


325417 Cl2_hs 




325450 C12J1S 




325452 C12JIS 




309815 


AW292760 


309839 


AW296076 


309849 


AW297444 


309906 


AW339340 


302705 31765J 


U09060 U09061 


304037 


110400 


304039 


T47349 


304236 


W93Z78 


304257 


AA053294 


304382 


AA232873 


304405 


AA282572 


304561 


AA489792 


304569 


AA490934 


304787 


AA582678 


304921 


AA803092 


327819 Qj5Jts 




304968 


AA614308 


306382 


AA968967 


331263 47479J 


AW780192 AA015718 W02571 


332252 1663967J 


N63882T91174 
TABLE 14B shows the genomic positioning for those primekeys lacking Unigene IDs and 
accession numbers in Table 14. For each predicted exon, we have listed the genomic 
sequence source used for prediction. Nucleotide locations of each predicted exon are also 
listed. 



Pkey: Unique number corresponding to an Eos probeset 

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers 

Strand: Indicates DNA strand from which exons were predicted. 

Nt_position: Indicates nucleotide positions of predicted exons. 
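As a reading aid, the four columns defined above can be modelled as one record per predicted exon. The following is a minimal sketch, not part of the original disclosure: the names `PredictedExon` and `parse_exon` are hypothetical, the pairing of the sample Pkey with the sample position is assumed for illustration only, and the exon length treats both listed positions as inclusive, which is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictedExon:
    """One row of Table 14B (hypothetical record layout)."""
    pkey: int    # Eos probeset identifier
    ref: str     # sequence source: a citation or a Genbank GI number
    strand: str  # "Plus" or "Minus"
    start: int   # first listed nucleotide position
    end: int     # second listed nucleotide position

    @property
    def length(self) -> int:
        # Minus-strand rows list the higher coordinate first, so the span
        # is taken from the absolute difference (inclusive ends assumed).
        return abs(self.end - self.start) + 1


def parse_exon(pkey: str, ref: str, strand: str, nt_position: str) -> PredictedExon:
    """Build a record from the four text fields of a Table 14B row."""
    if strand not in ("Plus", "Minus"):
        raise ValueError(f"unexpected strand: {strand!r}")
    start_text, sep, end_text = nt_position.partition("-")
    if not sep:
        raise ValueError(f"expected 'start-end', got: {nt_position!r}")
    return PredictedExon(int(pkey), ref.strip(), strand,
                         int(start_text), int(end_text))


# Illustrative values only; the row pairing is an assumption.
plus_exon = parse_exon("332807", "Dunham, I. et al.", "Plus", "297686-297808")
minus_exon = parse_exon("332864", "Dunham, I. et al.", "Minus", "1390386-1390296")
```

Rejecting any position without a hyphen is deliberate: several positions in the extracted table have lost their separator, and such rows are better flagged than silently misread.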



Pkey Ref 

332807 Dunham, I. et al. 
332808 Dunham, I. et al. 
332812 Dunham, I. et al. 
332901 Dunham, I. et al. 
333149 Dunham, I. et al. 
333916 Dunham, I. et al. 
334026 Dunham, I. et al. 
334061 Dunham, I. et al. 
334073 Dunham, I. et al. 
334150 Dunham, I. et al. 
334379 Dunham, I. et al. 
334719 Dunham, I. et al. 
334773 Dunham, I. et al. 
334893 Dunham, I. et al. 
334935 Dunham, I. et al. 
335146 Dunham, I. et al. 
335320 Dunham, I. et al. 
335568 Dunham, I. et al. 
335586 Dunham, I. et al. 
335601 Dunham, I. et al. 
338036 Dunham, I. et al. 
336123 Dunham, I. et al. 
336268 Dunham, I. et al. 
337173 Dunham, I. et al. 
337460 Dunham, I. et al. 
337685 Dunham, I. et al. 
337736 Dunham, I. et al. 
337780 Dunham, I. et al. 
337965 Dunham, I. et al. 
337976 Dunham, I. et al. 
338030 Dunham, I. et al. 
338112 Dunham, I. et al. 
338165 Dunham, I. et al. 
338178 Dunham, I. et al. 
338427 Dunham, I. et al. 
338506 Dunham, I. et al. 
338794 Dunham, I. et al. 
338910 Dunham, I. et al. 
339047 Dunham, I. et al. 
332864 Dunham, I. et al. 
332933 Dunham, I. et al. 
333193 Dunham, I. et al. 
333712 Dunham, I. et al. 
333940 Dunham, I. et al. 
333942 Dunham, I. et al. 
334287 Dunham, I. et al. 
334387 Dunham, I. et al. 
334487 Dunham, I. et al. 
334913 Dunham, I. et al. 
335109 Dunham, I. et al. 
335250 Dunham, I. et al. 



Strand 


Nt_position 


Plus 


297686-297808 


Plus 


298277-298360 


Plus 


309688-310561 


Plus 


1841954-1842090 


Plus 


3574317-3574413 


Plus 


82989944299169 


Plus 


9196549-9196681 


Plus 


9686941-9687077 


Plus 


9792201-9792374 


Plus 


10529221-10529854 


Plus 


13908356-13908467 


Plus 


15778859-15779026 


Plus 


16235169-16235328 


Plus 


19302753-19302881 


Plus 


20108247-20108373 


Plus 


21491292-21491457 


Plus 


22542132-22542246 


Plus 


24935021-24935655 


Plus 


24990333-24990497 


Plus 


25044923-25045157 


Plus 


29019796-29019877 


Plus 


30051089-30051186 


Plus 


31997555-31998040 


Plus 


23624127-23624224 


Plus 


32536159-32536395 


Plus 


3547161-3547245 


Plus 


3850500-3850643 


Plus 


41137934113990 


Plus 


7034267-7034392 


Plus 


7166011-7166119 


Plus 


80727084072827 


Plus 


10391398-10391600 


Plus 


12205719-12205875 


Plus 


12800037-12800181 


Plus 


19685043-19685354 


Plus 


21221871-21221953 


Plus 


27114697-27114763 


Plus 


28795375-28795551 


Plus 


3076079340760988 


Minus 


1390386-1390296 


Minus 


2035790-2035681 


Minus 


3832993-3832494 


Minus 


7286177-7286073 


Minus 


85238304523671 


Minus 


85526294552330 


Minus 


13294116-13293871 


Minus 


13946021-13945781 


Minus 


14432191-14432132 


Minus 


19463909-19463815 


Minus 


21325792-21325667 


Minus 


21952922-21952826 
335288 Dunham, I. et al. 
335290 Dunham, I. et al. 
335549 Dunham, I. et al. 
335862 Dunham, I. et al. 
335864 Dunham, I. et al. 
335905 Dunham, I. et al. 
336205 Dunham, I. et al. 
336276 Dunham, I. et al. 
336433 Dunham, I. et al. 
336605 Dunham, I. et al. 
338616 Dunham, I. et al. 
336679 Dunham, I. et al. 
337043 Dunham, I. et al. 
337272 Dunham, I. et al. 
337357 Dunham, I. et al. 
337393 Dunham, I. et al. 
337497 Dunham, I. et al. 
337646 Dunham, I. et al. 
337920 Dunham, I. et al. 
338083 Dunham, I. et al. 
338220 Dunham, I. et al. 
338752 Dunham, I. et al. 
338763 Dunham, I. et al. 
338983 Dunham, I. et al. 
339209 Dunham, I. et al. 
325240 5866848 
329532 3983505 
329522 3983507 
329519 3983510 
329511 3983514 
325326 5866875 
325303 5866908 
325389 5866921 
325417 5866925 
325450 5866941 
325452 5866941 
325498 5866967 
325587 6682462 
325602 5866994 
325701 5867028 
325780 6381953 
329722 6065785 
329728 6065785 
329666 6272129 
329815 6624888 
329841 6672062 
325824 5867048 
325866 5867076 
325902 5867101 
325958 5867142 
326014 5867160 
329941 6165199 
330002 6623963 
328154 5867170 
326023 5867245 
326278 5867269 
330036 6042048 
326547 5867307 
326495 5867423 
326507 5867435 
326505 5867435 
326506 5867435 
326530 5867441 
326508 6682496 
330120 6671864 
330123 6671869 
326858 65524S2 
326983 5867657 
327014 5867664 



Minus 




Minus 




Minus 


Z4OO6203"24608UO 


Minus 


26690300^26690125 


Minus 


£OOb45j / "ZOOtttOOZ 


Minus 


269ooow-2693o719 


Minus 


1f\ ATT A CO QlVITTO* 4 

30477450-3047/31 1 


Minus 


32093320-320931 81 


Minus 


340o7o40-340o74Zo 


Minus 


15o165U9-1561o35o 


Minus 


26021 027-2ou20o4o 


Minus 


20357902035001 


Minus 


•i*M/W44A AlAtVlftCA 

1 7407330*17407251 


Minus 


28241476-28241 307 


Minus 


30905179-30906109 


Minus 


3M71747-3W71569 


Minus 


33371317-33371ZDO 


Minus 


2648689-2648632 


Minus 


6051648-6051510 


Minus 


itMAJOa A44MA4 

931 8438-931 8301 


Minus 


14166440-14166104 


Minus 


26421374-25421 135 


Minus 


266281 4o-2dq2ou09 


Minus 


lonnoo ceo on aqtao 
2890B8o5-2990870ic 


Minus 


04J(Y1rtC4 444ft4CQ4 

32492953*32492594 


Minus 


AMA4 q4cca 
32301 -32650 


Plus 


42937-43014 


Minus 


ocotc OCX CO. 

33265-35458 


Plus 


1 8407-1859 f 


Plus 


4AACC 4444E 

20965-21325 


Plus 


47726-48024 


Minus 


73556-73630 


Plus 


239072-239759 


Minus 


•MAC4C tmotjIC 

110635-110745 


Minus 




Minus 


TAJ 4 A/3 "ini(4A4 

704103-704202 


Plus 


173372-173930 


Plus 


1267Z4-1 26967 


Plus 


79122-79251 


Minus 


72936-73040 


Plus 


C4C4jI P4PT4 

03634*63873 


Minus 


4<14*F44 •M4AA4 

112713-112992 


Minus 


nme a a n/vni4 

207544-207741 


Plus 


Anon** 

98307-98448 


Minus 


68431-68720 


Minus 


40181-40331 


Minus 


A 4 A CA_ 4 4044 

42450-42633 


Minus 


94350-94628 


Minus 


127729-127842 


Plus 


CO A TT_C4 CCA 

53437-53550 


Minus 


10358-10447 


Minus 


-54319-34411 


Plus 


46097-4ol58 


Minus 


7<A4 "HTO 

7103-7179 


Din* 

rlUS 


171 70,0.171 flOA 


Plus 


75250-75903 


Plus 


117120-117216 


Minus 


623677-623870 


Plus 


11843-11930 


Minus 


13038-13111 


Minus 


88184949 


Minus 


9368-9509 


Minus 


303000403122 


Plus 


78904-79112 


Minus 


127553-127656 


Minus 


35311-35406 


Minus 


69337-69670 


Minus 


16023-16581 


Plus 


1017630-1017788 
326330 6458782 
326920 6456782 
327058 6531865 
327061 6531965 
327075 6531965 
327120 6531970 
330126 6033735 
327157 5866841 
327183 58S7442 
327192 5867445 
327288 5867481 
327469 5867772 
327489 6004459 
327526 6381882 
327574 6867818 
327665 5867839 
327752 S8S7949 
327819 5867968 
327796 5867982 
330260 6671684 
330282 6671910 
328078 5868008 
328121 5868031 
328190 5868077 
328227 5868105 
327871 5868131 
328018 6902482 
328624 5868246 
328744 5868290 
328799 58S8316 
328291 5868363 
328329 5868375 
328369 5868388 
328385 5868395 
328397 5868397 
328412 5868405 
328538 5668485 
328656 6004473 
328638 6004473 
328903 5868514 
328960 6456775 
330320 5932415 
328993 5868536 
329081 5868602 
329089 5868614 
329109 5868626 
329192 5868716 
329218 5868726 
329224 5868728 
329246 5868732 
329415 5868874 
329454 6868887 



Plus 


60B950-6077D5 


Minus 


4242542519 


Plus 


2384268-2384835 


Minus 


3486389-3486873 


Plus 


4Q4131A-4011431 


Minus 


6-1088 


Plus 




(YlUlUo 


AATSLStAR 
•rHKr'trW 


Plus 


OHO 1 f trtiWI 


Minus 




Plus 




Plus 


1<1554Q-145706 


Mfaii ic 

NllilUa 


577Q6-5flfl15 




97010-07123 


Plus 


OO/Of "WJ IfcU 








Q.T791-Q4A91 


Minus 


09909.09717 


PI, re 


R I Wfi7Ji t \AfVi 


Plite 
rlUS 




rUXS 


QQQO.A1 1A 
O30£"H 1 If 


Plus 


72807.77865 


Plus 


153782.153850 


PllM 

iluo 


91089-91 1R5 


Mtnitc 
IHUIUS 


91089-91949 


Minus 


POOOOJJQOOI 
O0u0<rO9££l 


Minus 


<M95A7-5d3133 


Mtmio 

IVUIIUo 


19066&-1 90836 


Plus 


138839-138722 


Minus 


Ovf / l~Ou9£d 


Minus 


1449M.1 44434 




101700.109930 








3fi9QS9-3701S5 




OHH90 ( "OHOUOJ 


Phie 


flR497.fif&10 


PflK 

riuo 


3Q14.J943 


Plus 


792616-792729 


Plus 


294618-294903 


Plus 


23625-24468 


Plus 


38547-38837 


Minus 


54458-54697 


Plus 


49160-50084 


Plus 


93368-93510 


Plus 


25805-26923 


Plus 


102168-102273 


Plus 


166936-167020 


Minus 


71408-71707 


Plus 


27422-27664 


Minus 


250541-250792 


Plus 


1011438-1011818 


Plus 


51342-51593 
TABLE 15: 169 GENES WITH SEQUENCE INFORMATION DEPICTED IN TABLE 16 

Table 15 depicts UnigeneID, Unigene Title, Primekey, Predicted Cellular Localization, and 
Exemplar Accession for all of the sequences in Table 16. The information in Table 15 is 
linked by EosCode to Table 16. 



Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

EosCode: Internal Eos name 

Localization: Predicted cellular localization of gene product 
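The EosCode link between Table 15 and Table 16 amounts to a keyed lookup. The sketch below is illustrative only and not part of the original disclosure: `Table15Row` and `link_by_eos_code` are hypothetical names, the Table 16 entry is a placeholder string rather than real sequence data, and the sample row assumes that the columns of the flattened table align in listing order.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Table15Row:
    """One row of Table 15 (hypothetical record layout)."""
    pkey: int                  # Eos probeset identifier
    ex_accn: str               # exemplar Genbank accession
    unigene_id: Optional[str]  # some rows list no Unigene number
    unigene_title: str
    eos_code: str              # internal Eos name, the key into Table 16
    localization: str          # predicted cellular localization


def link_by_eos_code(
    rows: Iterable[Table15Row], table16: Dict[str, str]
) -> Dict[str, Tuple[Table15Row, Optional[str]]]:
    """Pair each Table 15 row with the Table 16 entry sharing its EosCode."""
    return {row.eos_code: (row, table16.get(row.eos_code)) for row in rows}


# Sample row: column alignment is assumed, not confirmed by the source.
row = Table15Row(100452, "D87742", "Hs.241552", "KIAA0268 protein",
                 "PAB7", "not determined")
# Placeholder standing in for the sequence information of Table 16.
table16 = {"PAB7": "<sequence listed under PAB7 in Table 16>"}
linked = link_by_eos_code([row], table16)
```

A row whose EosCode has no Table 16 entry is paired with `None` rather than dropped, so gaps in the linkage stay visible.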



Pkey ExAccn UnigeneID Unigene Title EosCode Localization 



100394 D84276 
100452 D87742 
101249 L33881 
101485 M24736 
101514 M28214 
101851 M94250 
102398 U42359 
102522 U53347 
102669 U71207 
103119 X63629 
103709 AA037316 
104080 AA402971 
104144 AA447439 
104691 AA011176 
105370 AA236476 
106149 AA424881 
106579 AA456135 
107102 AA609723 
107217 D51095 
108153 AA054237 
109014 AA156790 
109112 M169379 
109890 H04649 
110151 H18836 
112971 T17185 
113021 T23855 
114908 AA236545 
114965 AA250737 
116393 AA599463 
116416 AA609219 
117698 N41002 
117984 N51919 
118985 N94303 
119018 N95796 
119126 R45175 
120992 AA398246 
121710 AA419011 
121913 AA428062 
122041 AA431407 
122593 AA453310 
123209 AA489711 
124526 N62096 
126399 AA128075 
126645 AI167942 
126966 R38438 
127537 AA569531 
128790 AA291725 
129109 AA491295 
129184 W26769 
129389 AA621604 



Hs.66052 CD38 antigen (p45) PBC1 
Hs.241552 KIAA0268 protein PAB7 
Hs.1904 protein kinase C, iota OAA1 

selectin E (endothelial adhesion molecul ACC5 
Hs.123072 RAB3B, member RAS oncogene family PFJ2 
Hs.82045 midkine (neurite growth-promoting factor LBH9 

gb:Human N33 protein form 1 (N33) gene, PDG3 
Hs.183556 solute carrier family 1 (neutral amino a PFJ4 
Hs.29279 eyes absent (Drosophila) homolog 2 LEM9 
Hs.2877 cadherin 3, type 1, P-cadherin (placenta LBG2 
Hs.13804 hypothetical protein dJ462023.2 PD06 
Hs.57771 kallikrein 11 PBA6 
Hs.183390 hypothetical protein FLJ13590 PDM3 
Hsj37744 Homo sapiens beta-1 adrenergic receptor PAV1 
Hs.22791 transmembrane protein with EGF-like and PDM9 
Hs.256301 hypothetical protein MGC13170 POOS 
Hs.23023 ESTs PAA4 
Hs.30652 KIAA1344 protein PAA3 

DKFZP586E1621 protein PDG8 
Hs.40808 ESTs PBF1 
Hs.262036 ESTs, Weakly similar to Z223_HUMAN ZINC 
Hs.257924 hypothetical protein FLJ13782 BCU4 
Hs.20843 Homo sapiens cDNA FLJ11245 fis, clone PL 
Hs.31608 hypothetical protein FLJ20041 PAV9 
Hs.83883 transmembrane, prostate androgen induced 
Hs.129836 KIAA1028 protein PD03 
Hs.54973 cadherin-like protein VR20 PFJ6 
Hs.72472 ESTs BCY2 

hypothetical protein MGC2648 PDV3 
Hs.39982 ESTs OAB6 
Hs.45107 ESTs PDT9 
Hs.106778 ATPase, Ca++ transporting, type 2C, memb 
Hs£5028 ESTs, Weakly similar to B4374 gene NF2 PDM8 
HSJ278S95 Homo sapiens prostata mRNA, complete cds 
Hs.117183 ESTs PBF8 
Hs.97594 KIAA1210 protein PDG5 

prostate androgen-regulated transcript 1 PDV5 

ESTs; protease inhibitor 15 (PI15) BCU7 
Hs.98732 Homo sapiens Chromosome 16 BAC clone CIT 
Hs.128749 alpha-methylacyl-CoA racemase PD01 
Hs.203270 ESTs, Weakly similar to ALU1_HUMAN ALU S 
Hs.293185 ESTs, Weakly similar to JC7328 amino aci PAV4 

transmembrane, prostate androgen induced 
Hs.61635 six transmembrane epithelial antigen of PAA5 
Hs.182575 solute carrier family 16 (H+/peptide tra PD05 
Hs.162859 ESTs PAA6 
Hs.105700 secreted frizzled-related protein 4 BCX2 
Hs.108708 calcium/calmodulin-dependent protein kin PFJ7 
Hs.109201 CGI-86 protein PAV6 

spondin 2, extracellular matrix protein CJA5 
plasma membrane 
not determined 
cytoplasmic 
plasma membrane 
cytoplasmic 



plasma membrane 
cytoplasmic 
plasma membrane 



plasma membrane 



not determined 



PDG7 

not determined 
PDG4 



CHA1 not determined 

plasma membrane 
mitochondrial 



EH 

PAJ5 not determined 
PAB2 plasma membrane 



vesicular 

PAZ1 not determined 

PAA2 plasma membrane 
plasma membrane 
PDY4 

plasma membrane 
plasma membrane 
not determined 
secreted 

vesicular 
not determined 

129404 
123534 
130760 
131425 
132364 
132987 
133179 
133330 
133520 
133724 
133724 
133944 
134110 
301805 
302005 
302881 
303508 



AA172056 

R73640 Hs.11260 

AA128997 Hs.18953 

AA219134 Hs.26691 
AA031360 



ESTs 

hypothetical protein FLJ11264 



AA032221 

U81599 

U423S0 

X74331 

U07919 

U07919 



Hs.61835 
Hs.66731 
Hs.71119 
Hs.74519 
Hs.75746 
Hs.75746 



ESTs 
ESTs 



303753 
308050 
310382 
310431 
310573 
310598 
310816 
311596 
313676 
314121 
314691 
314785 
314907 
315051 
315052 
316442 
317548 
317869 
318524 
319191 
319763 
320324 
320561 
320796 
321441 
322303 
322782 
322818 
323228 
323287 
324295 
324430 
324603 
324617 
324626 
324658 
324718 
330211 
330546 
330762 
330790 
330892 
331099 
331490 
331889 
332247 
332396 
332697 
332798 
334447 
338255 



AA045870 Hs.7780 
U41060 Hs.79135 
AB00004 Hs.142846 
AI869666 Hs.123119 
AA508353 Hs.105314 
AA340605 Hs.105887 
D30891 Hs.19525 
AW503733 Hs.9414 
AM60004 HS51608 
AI734009 Hs.127699 
AI420227 Hs.149358 
AW292180 Hs.156142 
AI33S013 Hs.140546 
AI973051 Hs.224965 
AI682088 Hs.79375 
AA861697 Hs.120591 
AI732100 Hs.187619 
AW207206 Hs.136319 
AI538226 Hs.32976 
AB72225 Hs.222886 
AW292425 

AA876910 Hs.134427 
AA760894 Hs.153023 
AI654187 Hs.195704 
AW295184 Hs.129142 
AW291511 Hs.159056 
ARJ71538 
AA460775 Hsj6295 
AF071202 Hs.139336 
NM_006953 Hs.159330 
AF038966 Hs.31218 
AW297633 Hs.118498 
W07459 Hs.157601 
AA056060 Hs.202577 
AW043782 Hs.293616 
AF055019 Hs.21906 
AAS39902 Hs.104215 
AI146686 Hs.143691 
AA464018 Hs.184598 
AW016378 Hs.292934 
AA508552 Hs.195839 
AI685464 

AI694767 Hs.129179 
AI557019 Hs.116467 



D31382 

AA449677 

T48536 

AA149579 

R36671 

N32912 

AA431407 

N58172 

AA340504 

T94885 



Hs.299867 

Hs.15251 

Hs.122764 

Hs.91202 

Hs.14846 

Hs.291039 

Hs.88802 



plasma membrane 
nuclear 

plasma membrane 

PDT1 mitochondrial 
PDT1 mitochondrial 
PAB9 cytoplasmic 
plasma membrane 
nuclear 



secreted 

not determined 
not determined 



PAB4 

PAB secreted 
PEE8 nuclear 
PBA7 
PAA7 
Of PM17 

homeobox B13 PFJ5 
Putative prostate cancer tumor suppresso PDM1 
primase, polypeptide 2A (58kD) PDM2 
aldehyde dehydrogenase 1 family, member 
aldehyde dehydrogenase 1 family, member 
Homo sapiens mRNA; cDNA DKFZp584A072 (fr 
LIV-1 protein, estrogen regulated BCR4 
hypothetical protein PEU4 
MAD (mothers against decapentaplegic, Dr PBJ6 
relaxin 1 (H1) PBH3 
ESTs, Weakly similar to Homolog of rat Z PEG4 
hypothetical protein FLJ22784 PBM4 
KIAA1468 protein PBY3 
hypothetical protein FLJ20041 PEU5 
KIAA1603 protein PCQ8 
ESTs, Weakly similar to A46010 X-linked PBH1 
ESTs PEN3 
ESTs PCW3 
ESTs PET5 
holocarboxylase synthetase (biotin-[prop PBH8 
ESTs PBY2 
ESTs PBY1 
ESTs BFF8 
guanine nucleotide binding protein 4 CB07 
ESTs, Weakly similar to TRHY_HUMAN TRICH 
ESTs PBM9 
ESTs PBJ7 
ESTs PBJ9 
ESTs PBQ8 
deoxyribonuclease II beta PBQ7 
hypothetical protein FLJ10188 PBJ1 
prostate epithelium-specific Ets transcr PEN1 
ESTs, Weakly similar to T17248 hypotheti PE07 
ATP-binding cassette, subfamily C (CFTR PBH5 
uroplaWnS PEL9 
secretory carrier membrane protein 1 PBY4 
Homo sapiens LUCA-15 protein mRNA, splic 
ESTs CBF9 
Homo sapiens cDNA FLJ12166 fis, clone MA 
ESTs PCQ7 
Homo sapiens clone 24670 mRNA sequence 
ESTs, Moderately similar to SPCN_HUMAN S 
ESTs PBQ9 
Homo sapiens cDNA: FLJ23241 fis, clone C 
ESTs PBM3 
ESTs, Weakly similar to 138022 hypotheti PBH4 
gb*88f04jc1 NCI_CGAP_Pr28 Homo sapiens 
Homo sapiens cDNA FLJ13581 fis, clone PL 
small nuclear protein PRAC CBK1 

PBJ2 

guanine nucleotide binding protein 4 PEW! 

hypothetical protein PBM1 

TMPRSS2, transmembrane protease, serine 

ESTs PBQ4 

Homo sapiens mRNA; cDNA DKFZp564D016 (fr 

ESTs PCM 

ESTs, Moderately similar to T14342 NSD1 PBH7 

gb2a21to9^1Soaresfetaltiverspleen PBQ5 

gbliv»31a09j(1 NCLCGAPJOdll Homosapten 

transgelin 2 PBQ8 

. PBH2 nudear 
PBY9 not determined 
PBY7 not determined 



plasma membrane., 
plasma membrane 



not determined 
cytoplasmic 
PBM2not determined 

plasma membrane 



cytoplasmic 



plasma membrane 



not determined 
PBY8 not determined 
secreted 

PBQ1 not determined 
plasma membrane 
PCI2 not determined 
PBJ5 

not determined 
PBY6 not determined 

cytoplasmic 
-PCW6 

PBJ4 plasma membrane 
nudear 
not determined 



not determined 
PEL3 plasma membrane 
plasma membrane 
PCQIcytoplasmlc 
nudear 

not determined 
nuclear 

PBJ8 not determined 
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401424 PFG2 
407122 H20276 H&31742 ESTs PEW 
408430 S79876 H&44926 cflpepfidytpepfMase IV (C026, adenosine PEZ3 
408826 AF216077 H&48376 Homo sapiens dona HB-2 mRNA sequence 
5 409262 AK000631 H&S2256 hypothetical protein FU20624 PFG1 
409361 NM_005982HsJ54416 ^ocu&shomsol»x(DresophlIa)hainoloPEVV3 
411096 U80034 Hs.68583 mitochondrial intermediate peptidase PEZ9 
413125 BE244S89 Hs.75207 glyoxalasal PFJ3 
413623 AA825721 H&246973 ESTs OBH6 

10 414422 M147224 H&337232 HomaoboxA13 PFC6 
415263 AA94B033 Hs.130853 ESTs PEZ5 
417153 X57010 Hs.81343 'coBagan, type II, alpha 1 (primary ost PFJ1 
418601 AA279490 H&86368 calmerjln PFA1 
418848 AI820961 Hs.193465 ESTs PEY4 

15 418882 NM.004996H&B9433 ATP-tMng cassette, sub-family C (CFTR OBH2 
419839 U24577 H&93304 "phosphrjpase A2, group VII (platelet-a PFH9 
421887 AW161450 Hs.109201 C6I-66 protein PFH2 
422083 NMJM1141HS.111256 "arachidonata ISBpoxyganasa, second ty PFH5 
424565 AW102723 Ha.75295 guanylata cyclase 1, soluble, a^iha 3 PFA3 

20 425071 NM.013989HS.154424 ^lelafltasa.lodothyrorilne.typeir PFH6 
425710 AFO30B80 solute carrier family, mamber 4 PFD4 

427958 AA41800O Hs.98280 potassium Intarmedlata/smafl conductance PFH1 
428819 AL135623 Hs.193914 KIAA0575 owe product PFD6 
429900 AA460421 H&30875 ESTs PEZ7 

25 429918 AW873986 Hs.119383 ESTs PEY5 
430226 BE245562 Hs2551 adrenargic hala-2-, receptor, surface PEZ4 
431217 NM_013427rte250830 RnoGTPasaacflvating protein 6 PFG6 
431716 089053 H&268012 fatty-add-Coenzyme A Ogase, long-chain PEZ1 
431992 NM.002742H&2891 protein kinase C.mu PFH4 

30 432189' AA527941 c#nh30c04.s1 NCI_CGAP_Pr3 Homo sapiens 

432244 AI669973 H&200574 ESTs PEW8 
432437 W07088 HS29368S ESTs PFQ3 
432966 AA650114 Hs.325198 ESTs PEV3 
439176 AI446444 Hs.190394 ESTs, Weakly similar to B28096 Bne-l pr PEWS 

35 440260 AI972867 Hs.7130 copIneW PEW6 
440901 AA909358 Hs.128612 ESTs PFC8 
445424 AB028945 cortactin SH3 domain-binding protein PEZ6 

446320 AF126245 Hs.14791 "acyl-Coenzyme A dehydrogenase family, m 
447210 AF035269 prwsptefldyfeerine-specfflophosrAoDpas PFH8 

40 449156 AF103907 Hs.171353 prostate canoer antigen 3, rran-coding DD PEZ8 
449625 NM-014253 odz (odd Oz/ten-m, Drasophila) homolog 1 PEZ2 

449650 AF055575 H&23838 calcium channel, voRaga^ependent, L ty PFD2 
451939 U80456 Hs27311 single-minded (Drosophaa) homolog 2 PFJ8 
451982 F13036 rfe27373 Homo sapiens mRNA; cDNA DKF2p55401763 (I 

45 452039 AI922988 ESTs PF08 

452340 NM_002202Hs505 ISL1 tanscdpfion factor, UM/homeodoma PFG4 
452784 BE463357 Hs.151258 hypothetical protein FU21062 PFC5 
452946 X9S425 HS31092 EphA5 PFH3 



mitochondrial 

plasma membrane 

PEY1 

nuclear 

nudaar 

mitochondrial 

cytoplasmic 



ER 



plasma membrane 



plasma membrane 
plasma membrane 
nuclear 



plasma membrane 
nudaar 

cytoplasm's; 
PFA2 



PFH7 



PFG9 plasma membrane 

nuclear 
cytoplasmic 
plasma membrane

TABLE 15A shows the accession numbers for those primekeys lacking a unigeneID in Table
15. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"



Pkey CAT number Accession 

116393 131S43J A1972402 AI634409 AI523716 AI799749 W44518 AI424438 AI688513 AI971048 AI686324 AW013854 AA588483 AA52811 1 AK27428 



AI582200 AI669296 AI826926 AI620526 AI669958 AI9724S8 AJ924500 AA512903 W44S17 AA3353S3 AW238997 BE300165 
BE250665 AA284195 AA523420 W52834 AI471970 AI952824 AW003S20 AW009463 AA689796 AA1 14966 AI653342 AA1 15038 
A1342150 AI092100 AI96821 1 W51994 AI8O40O5 AI201420 A1123210 AI738405 AI674964 AJ970341 AW027S00 AM93316 AI333193 
AI139353 AA599463 AI656163 AB04200 AB85321 AI990213 AI65701 1 AA650025 AI968810 AB41978 AA599839 AW592602 
AA644289 AI468578 A1565265 AB65228 BE221535 AW973052 



101485 181 13.1 AA296520 AL021940 M30640 NM.000450 M24736 M61894 AL047443 H39560 AI634S91 AA916787 AI214796 AA939085 AI150616 

AA412553 AA412545 AI051015 T27654 AA694430 
126399 17331.1 AA088767 AF224278 AA128075 AL035541 AA027926 AI761441 A1972096 AW071 693 AI742327 AI377498 A1804815 AIS40802 



AI885O01 AI921394 AA5951 15 N71820 AI921217 AW007283 AI467828 A1369306 AAS17446 A1493698 AA088701 AA126899 AI936228 
AW204238 AI039567 A192S027 BE13S909 AW452945 AW135998 AA310984 AA027860 AW073519 AI537597 AA953976 A1521341 
AW273569 AW050740 AA536113 AA559064 AI474392 AW135709 AA535181 AW572959 AA570597 AB05464 AI677810 AI587642 
AW975102 AA424310 AA4B2527 N64192 AA658276 AW8891 17 AA486591 AW889172 AI381990 AB81991 A1673419 A1990950 
AA487031 AI272934 AI150565 AA22916B AW316722 AI142707 8E22239B AA61416B AA122026 AW338227 AA632457 AI968726 
AW369662 AA512956 AA541675 AA451748 AE50993 BE146418 AA122Q25 



132964 94346.1 A1362575 AI805082 AW263421 AI432462 AA135870 AA031 360 AAD31604 AA298475 AA298464 

129389 21074.1 NM.012445 ABQ27468 BE407510 BE047605 AA047125 AW084003 AA149494 AA149490 AA292528 AA5705O5 AAS26186 AW006250 



AW007762 AJ341557 AI799668 AI972710 AB77966 AI962810 AI084783 AI458032 AI190971 AW148913 AA372354 AW970032 
AWD07426 AA650183 AI123203 AI122890 AE80975 W73S95 W73495 AI863238 AA374109 AA603986 AW149089 AW957523 
AI307748 AI921067 A1336463 F24537 A1380460 AB67500 AI189309 AI814701 AI766921 AW572106 AA037024 AW072576 AA578293 
AI288103 AA235434 AW450642 AA574230 AW294024 AI589229 A1580733 AW512227 AA877009 AI660255 AW188597 AA558228 
AI572782 AA658397 AI274628 AI868359 AA884573 AI264439 AA621604 AW515493 AW243333Z39737AI567038AA573997 
AA573559 AW236431 AI652B70 AI684973 AA034505 AA047126 



129404 156454J A1267700 AI720344 AA191424 AI023543 AI469833 AA1 72056 AW958465 AA172236 AW9S3397 AA355086 

107217 0836.1 AL080235 AA031750 D81382 AI480231 AI095947 AI560953 BE010721 AI870290 AA374945 AA125792 051527 051556 AI685541 



051559 AW1 1728S AA195741 AI675138 AW593439 AE01B85 T30590 AW952100 051095 AA523864 W70043 AA987586 AI421515 
AI205532 AA127069 A1337367 051535 AI453785 AW075677 AW088359 C14287 C14284 



121710 19266.1 AF163474 NM.016590 AF163475 A1761105 AI770098 AA410580 AA411616 A1590343 AI739050 AL050198 AI862645 AA419104 



AA513809 AA333032 AB16915 AW139625 AA640839 A131 1391 AI627693 AW135514 AA41901 1 AI269149 AI245259 AI970008 
AI970017 AW139445 AA569503 AI761072 AI766179 AI759995 AI300776 A1870129 AW150770 AA225501 AA226220 . 



121913 291015.1 A1249368 AI742316 AA428062 AA442089 A1664189 BE349478 A1803475 AI584Q49 BE552085 A1088609 AE64197 AI888144 AI129474 

AI307145 BE181300 AW058403 A1696838 AW748598 AA442196 AI216428 
102398 entre^U42359U42359 

315051 347217J AW292425 BE487167 AI702953 BE550961 BE222309 A1299348 A1693336 AA541708 
324626 336411.1 AI685464 AW971336 AA513S87 AA525142 

319191 16065.1 NM-012391 AF071538 AB031549 AI685592 AI745526 AA662204 AW130657 AA662164 AW971 121 AI668916 AA513274 AB91223 



AI979170 AW298436 AA639821 AI859010 AW513942 AI687669 AA662521 AA548598 AB45056 AI305374 BE043418 AI432856 
AI334840 AB79796 AI492693 AI307915 BE042082 AI307834 AI307858 A1309488 BE042210 AI435670 AB71605 AI862491 AJ284563 
A1306872 A1255044 AI254601 AI251238 AI473073 A1473042 AI432760 AI435664 AI336826 AI289365 AI369098 AB62274 AI334871 
AI349863 AE50405 A1377617 A1309895 AJ313017 AB62291 AI311936 AB78718 A1305722 AI306769 AI308888 AB34565 A1862296 
AI344230 AI435685 AI344087 AI378696 AI311209 AI435775 AI310611 AI311154 AI432289 AI431561 A1492681 AI432B67 AI335288 
A1492796 AI432769 A1310299 AI432273 AI379820 A1275319 A1435753 AI609441 A1432767 AI369100 A1311420 AB49974 AI247157 
A1334677 AE70910 AI224320 AI305608 AI334489 AI377152 AI350012 AI370086 AI335053 AI306781 AB06750 AB34849 AI334874 
AI340380 AB07876 AI305974 AI30S972 AI311521 AB34872 AI862509 AI31 14S8 AJ335051 AE89684 A1310859 AB11862 AIS62483 
AM92775 AI307906 A1492708 AI289693 AI340373 AI307910 AI31 1359 A1435653 AB34865 AI311492 A1492809 AI492690 AI431576 
AI862268 AI311879 A1303435 AI492792 AJ862512 AI275321 AI431568 AI431564 A1307885 AJ307926 AI435692 AI435778 AI310182 
AI308B94 AI492707 AI492713 AI308560 AB07829 A1343234 AI580598 AW472796 AI340918 AB10243 AI309368 AJ307920 AI289665 



column. 



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers
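
As a rough illustration of how the three columns defined above fit together, the following Python sketch groups Table 15A-style rows into a lookup keyed by Pkey. The function name `parse_cluster_table`, the rule used to recognise the start of a row (a numeric Pkey followed by a CAT number containing a dot), and the wrapped second line in the sample are assumptions made for this example only. The sample values are copied as printed for Pkey 324626.

```python
def parse_cluster_table(lines):
    """Group Table 15A-style rows into {pkey: (cat_number, [accessions])}."""
    table = {}
    current = None
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue  # skip blank separator lines
        # Assumed row rule: numeric Pkey, then a CAT number that contains a dot.
        if len(fields) >= 2 and fields[0].isdigit() and "." in fields[1]:
            current = fields[0]
            table[current] = (fields[1], list(fields[2:]))
        elif current is not None:
            # Any other non-blank line is treated as a wrapped accession list.
            table[current][1].extend(fields)
    return table


rows = [
    "324626 336411.1 AI685464 AW971336",
    "AA513S87 AA525142",  # hypothetical line wrap of the same printed row
]
clusters = parse_cluster_table(rows)
print(clusters["324626"])
```

Rows whose CAT number has lost its dot in transcription would be folded into the preceding row by this rule, so the printed table would need cleaning before such a parser could be relied on.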
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A1306777 AW08631 8 AW086292 AWD86378 AB10027 AK75293 A1369082 A1340900 AI306749 A1371558 AW086237 BE043803 
AI306793 AB08272 AI287948 AI270917 At284816 AI335813 A1284546 A1308044 AI275290 AI270372 A1306795 AI289687 AE23570 
A1305303 AB89677 AI287742 AE75284 AQ06S12 AB35701 AI371554 AB78719 A1344988 AK23831 AI33S141 AB43222 AI284568 
AI305357 AK75270 AI345932 AI43S549 AB07925 A1311502 A1344238 AB43182 AI308508AI305988AI270790AI379792AI305647 
AI305410AW322S1 AM38517A1343227AI305534AI3403B7AI2n043AI305489AI271046AI^^ 
AI310848 AI305848 AI289362 AI2S29S4 AI307049 A1310831 AI306993 AB08796 AI224659 AI305969 A1349855 AB06164 AI306948 
AI284676 AI309155 AI343202 AI432785 AB06815 AB69081 AI270385 AB89699 AM35704 AI309547 AI305716 AB11281 AI287927 
AI472995A1340423AI270958AI307069A1305364A!270807A!2753re AM32861 AI2S5113 

AI305709 AW73008 AB11168 AB09711 AI377164 AE71201 AI289560AB09710AI306195AI311201 AI287741 AB71066 AI432876 
AI275281 At37S79S AI472072 AI31 1 9S7 AI306326 AI305465 AI270782 AM7301 9 A1305340 AI270922 AI305%5 AQ06462 AI254144 
AB70969 AI473012 AI305390 AE75278 AI223544 AE89692 AE50318 AB05372 AI289691 AI2S0521 AI306283 AI306B14 AB07933 
AM73160 AI432903 AB23720 AES4979 AI334882 AB06926 AE89541 AI432248 AM35722AM35698AU328S9AB10683A147317S 
AI335144 AK89467 AI436489 AI306928 AI473033 AB05763 AI307868 AB07882AI348959AU35736AM32857AM32896 AM35735 
AM32283 AW73088 AI432883 AM73081 AI432825 AI30784O AI473164 AI432885 AI473168 AI472982 AI435734 A1473060 A1473171 
AI432279 AI432882 AI334670 AM36512 A1432827 AI4328S2 AM73051 AU73077 AI435697 AE71509 AI4SZ781 AI472983 A1473018 
A1432897 AI473043 AI432871 AI436536 AI473157 A1349715 AM32777 AI47301 6 AI473158 A1340369 At307941 AI432773AI377146 
AI492791 AB70950 AI305342 AE84604 A1306269 Af23481 1 AI270811 AI289347 AI334869 AI334852 A1311759 AE50332 AI309520 
AI2B9550 AB05721 AB40870 AI270901 AI30857S AB07904 AI34071S AE70941 AI309808 AE46867 AM73014 AI307039 AI289360 
AI473069 A1492788 AI344013 AI305876 AM36510 AB40742 A1473028 At307891 BE041871 BE041268 BB042340 BE041948 
BE041783 AQ06173 AB01948 AB26972 AI275769 

338255 CH22_6856FQ_UNK_EMAC00 

330211 Q_5_p2 

332798 CH22_14FGJJ5_UNK_C4G1 .Q 
334447 CtG2_1746FG_387_7JJNK_EM 

332247 372969 1 AA669097 AA513815 AA026798 AA676526 AA704429 AA704269 AW1 18292 AA579216 N58172 

332398 20265 T AW579842 BE156562 BE156690 BE156489 BE081033 AK001559 BE149402 M85387 AW367811 AW367798 R17370 AI908947 

AA382932 R58449 H18732 AA371231 AW9S2899 AA713S30 AWB92946 R53463 H11063 AWDSB542 Z40761 BE176212 BE176155 
W23952 W92188 AW374883 AA303497 AW954769 AA036808 BE168063 AW382073 AW382085 AL041475 H80748 A1078161 
BE463983 AI805213 A1761264 W94885 N94502 AI623772 AI419532 AB10302 AI634190 AW002516 AW150777 AB52312 AI367474 
AW204807 AI675502 AI337026 AW134715 BE32B451 AI123157 AI560020 AB00745 A1608631 AI248873 AA7424B4 AW051635 
H18646 A1245045 AA5071 1 1 AI640510 A1925594 M1 15747 AA143035 AA151 106 

332697 13699J X51405 NM_001 873 T1 1322 AL118886 BE328175 AW136009 BE467445 AW470313 M774852 BB04139 AW501046 AA032792 
AW389231 AA370044 R36841 AA371457 C04813 R25791 R25556 AW895B54 AW903B19 AW895671 AVI/895677 BE159723 
AW895664 AW895597 AW895595 AW895S65 AW888518 AI903724 F06081 F08503 AL1 19462 AW895730 AW888516 R2651 1 
R26489 AA334126 AA327626 N85713 AW895998 AA223622 F05468 AA370749 W05590 M78202 AA371073 AW498607 R15017 
T16991 AA001282 AA001 138 AA551566 AA330159 AI922855 AA383512 AAQ29603 D82246 032171 T94933 H56545 AA348060 
AA176888 R96764 AW451817 AA385766 AA452618 AIS90057 AA988822 BE549928 AA1509Q1 W57992 AW899925 C05281 
AA932042 AA370980 AW962877 W04741 AA369982 AW385948 AA922466 N75882 AI422070 AI361256 AI68Q224 057122 T94885 
R53268 R46713 T19071 AW796277 AA325333 F04719 FQ2334 AA358146 AA625597 AA358304 AW028099 AL1 19570 D57290 
058273 057796 N48555 AC361969 AA329457 D57225 AW024046 AA992606 AWQ22118 AW021538 AA935845 H89870 H5S546 
AW961219 AA453239 AW837541 N45521 BE218029 AA318877 AA327740 AW9S1809 T92139 D53216 052365 D53363 053312 
053116 AI547267AA678935 AW026552 AVV026418 AW190507AI927710 AW244108 050948 AW054991 AW021063 AW022511 
AA493436 AI365636 BE464751 AW149384 AA102442 AW771368 AI818251 AI126368D51049A1421542A1559467AW079779 
AW021048 AW023989 AW044214 A1458264 AAQ27274 AI620254 AW028917 BE21951 1 AA326242 N67561 A1971273 AAB78328 
D57131 AA770662 A1309299 AI796767 AA613338 W58076 AI566287 AM45573 AI880260 AA001919 AW339259 A1492610 AI49261 1 
R97692 AI301425 AA722603 D58361 AI350323 AA973928 AI431263 AA516126 AA865467 AI925177 N39443 AA001943 AE99371 
AKB2412 AA66SO90 AA583433 H89871 AA977231 AI362219 AI056096 A1270446 N67524 N22103 AW614224 AA744054 AW243622 
AI613188 AB29173 AB50243 AI362138 AA744004 AA176661 D56787 A1955625 AI393109 AI094769 AI479728 AI423107 AI955617 
AI034036 AI582196 AW264534 AM18961 AA570761 AI343538 AA650341 AA992503 AA770004 AL039666 AIB62675 AW190335 
AA610274 AW418627 BE467472 056786 T28749 A1217610 AB59556 T23523 AU040189 AA846222 AA651636 D51280 AI888986 
AI521167 AB40177 AW612815 AI625285 AA621607 AA177059 AA229768 AA829788 AI749682 AW190631 N75299 AA23Q089 
A1915632 BE069542 AA890Q20 AA528397 AA995390 BE503B60 AA570812 AW339396 AI197986 AI203725 AE82379 AA670375 
AA461513 F01728 AW243599 C00856 N7S567 R95995 AA150932-R95961 AA648060 AA933800 AA927073 AA101 126 AA864190 
T93566BE187472 

425710 25529J AF030880 NM_000441 AC002467 AA335554 H23053 AW891838 A1139968 AA653057 A1695233 
432189 342819 1 AA527941 AI810608AI620190AA635266 

445424 6391J AB028945 T77648 F13328 AL157605 Z46212 AA304736 F1 1855 T66093 T30174 AW954164 AW176301 AW748243 AA456428 
AI369958 AA938565 AW959613 242008 AA994779 AI683909 F1 1019 F10926 AI769597 AI752550 T65015 AI884314 AA643954 
Z41838 AW020147 AI038822 AW571822 AA299781 AA894928 AF131790 BE00541 1 A1902476 AW082695 AA464384 R42750 
AW902301 AA464273 R05837 Z38294 H41098 AL134507 M86079 

AF035269 AF035268 NM.015900 T96213 U37591 AA156832 AA299371 AI084325 H95977 AI765967 BE221465 AA156726 AB69563 
AW024539 AI436791 AI949451 AA843093 AI452756 AA824232 A1305587 T96131 AW207447 AW243556 AW957032 A1084332 
H95978U30998 

NM_014253 AF100772 BE088769 AL022718 BE161779 AW863569 BE161640 AUS9060 BE168542 AW296554 AA323193 AA235370 
AW779760 N48674 A1375997 R45432 D59344 AI203107 F07491 R35360 R25094 AI913631 AI498402 T61382 AI016320 N45526 
T61415AA331486 

89513 1 AH22988 H05475 AA021608 AW169947 AA913750Z41614 AW800012 



447210 
449625 
452039 



7119J 



8113J

TABLE 15B shows the genomic positioning for those primekeys lacking unigene IDs and
accession numbers in Table 15. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also
listed.

Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.



Pkey      Ref                 Strand   Nt_position
334447    Dunham, I. et al.   Plus     14308764-14308824
332798    Dunham, I. et al.   Minus    232147-231974
338255    Dunham, I. et al.   Minus    15242294-15242231
330211    6013592             Plus     6915*59215
401424    8176894             Plus     24223-24428
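
The position ranges in Table 15B can be turned into exon spans with a few lines of Python. This is a minimal sketch, not part of the original disclosure: the function name `exon_span` is hypothetical, and it assumes that both endpoints belong to the exon and that minus-strand rows list the higher coordinate first, as the two Minus rows in the table appear to do.

```python
def exon_span(nt_position, strand):
    """Return (start, end, length) for an 'a-b' range from Table 15B."""
    first, second = (int(part) for part in nt_position.split("-"))
    # Assumed convention: Plus rows ascend, Minus rows descend.
    if strand == "Plus" and first > second:
        raise ValueError("plus-strand range should ascend")
    if strand == "Minus" and first < second:
        raise ValueError("minus-strand range should descend")
    start, end = min(first, second), max(first, second)
    # Assumption: both endpoints are part of the predicted exon.
    return start, end, end - start + 1


rows = [
    ("338255", "Minus", "15242294-15242231"),
    ("401424", "Plus", "24223-24428"),
]
for pkey, strand, position in rows:
    print(pkey, strand, exon_span(position, strand))
```

Normalising every range to ascending order while keeping the strand as a separate field makes it easier to compare predicted exons against other annotations on the same genomic sequence.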
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TABLE 11 AND SEQUENCE LISTING 

SEQ ID NO:1 BCU4 DNA SEQUENCE

Nucleic Acid Accession #: NM_024915

Coding sequence: 13-1890 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51

ATTGGATCAA ACATGTCACA AGAGTCGGAC AATAATAAAA GACTAGTGGC CTTAGTGCCC 60 

ATGCCCAGTG ACCCTCCATT CAATACCCO A AO AGCCTACA CCAGTQAGGA TOAAGCCTGG 120
AAGTCATACT TGG AQAATCC CCTQACAOCA GCCACCAAGG CCATGATGAT CATTAATGGT 180 
GATGAGGACA GTGCTGCTJGC CCTGGGCCTO CTCTATGACT ACTACAAGGT TCCTCGAGAC 240 
AAGAGGCTGC TGTCTGTAAG CAAAGCAAGT GACAGCCAAG AAGACCAGOA GAAAAGAAAC 300 
TGCCTTGGCA CCAGTQAAGC CCAGAGTAAT TTGAGTGGAG GAG AAAACCG AGTGCAAOTC 360 

CTAAAGACTG TTCCAOTQAA CCTITCCCTA AATCAAGATC ACCTGGAGAA TTCCAAGCGG 420
GAACA0TACA GCATCAGCTT CCOCGAGAGC TCTGCCATCA TCOCGGTGTC GGGAATCACG 480 
GTGGTGAAAG CTG AAGATTT CACACCAGTT TTCATGGCCC CACCTGTGCA CTATCCCCGG 540 
GGAGATGGGG AAGAGCAACG AGTGGTTATC TTTGAACAQA CTCAGTATOA CGTGOCCTCG 600 
CTGGCCACCC ACAGCGOCTA TCTCAAAGAC QACCAGCGCA GCACTCCQOA CAGCACATAC 660 

AGCQ AGAGCT TCAAGG ACGC AGCCACAG AG AAATTTCGGA GTGCTTCAGT TGGGGCTGAG 720
GAGTACATGT ATGATCAGAC ATCAAGTGGC ACATTTCAGT ACACCCTGGA AGCCACCAAA 780 
TCTCTCCGTC AGAAGCAGGG GGAGGGCCCC ATG ACCTACC TCAACAAAGQ ACAGTTCTAT 840 
GCCATAACAC TCAGCGAOAC CGGAGACAAC AAATGCTTCC GACACCCCAT CAGCAAAGTC 900 
AGGAGTGTGG TGATGGTGGT CTTCAGTGAA GACAAAAACA G AG ATGAACA GCTCAAATAC 960 

TGGAAATACT GGCACTCTCG GCAGCATACG GOGAAGCAGA GGGTCCTTGA CATTGCCGAT 1020
TACAAGGAGA GCTTTAATAC GATTGGAAAC ATTQAAGAGA TTGCATATAA TGCTGTITCC 1080 
TTTACCTGGG ACGTGAATGA AGAGGCGAAG ATTTTCATCA CCGTGAATTG CTTGAGCACA 1140 
OATTICTOCT CCCAAAAAGG GGTGAAAGGA CTTOCTTTGA TGATTCAG AT TGACACATAC 1200 
AGTTATAACA ATCGTAGCAA TAAACCCATT CATAG AGCTT ATTGCCAGAT CAAGGTCTIC 1260 

TGTGACAAAG GAGCAGAAAQ AAAAATOCGA GATGAAGAGC AGAAGCAGAA CAGGAAGAAC 1320
GGGAAAGGOC AGGOCTOCCA AACTCAATGC AACAGCTOCT CTCATGGGAA GTTGGCTGOC 1380 
ATACCTTTAC AG AAGAAGAG TGACATCACC TACTTCAAAA CCATJG0CTGA TCTCCACTCA 1440 
CAGCCAGTTC TCTTCATACC TGATGTTCAC TTTGCAAACC TGCAGAGGAC CGGACAGGTG 1500 
TATTACAACA CGGATGATGA ACGAGAAGGT GGCAGTGTCC TTGTTAAACG GATGTTCCGG 1560 

CCCATGGAAG AGGAGTTTGG TCCGGTGCCT TCAAAGCAGA TGAAAGAAGA AGGGACAAAG 1620
OGAGTGCTCT TGTACGTGAQ GAAGGAGACT GACGATGTGT TOGATGCATT GATGTTGAAG 1680 
TCTCCCACAG TGATGGGCCT GATGGAAGCG ATATCTGAGA AATATGGGCT GCCCGTGGAG 1740 
AAGATAGCAA AGCTTTACAA GAAAAGCAAA AAAGGCATCT TGGTQAACAT GGATGACAAC 1800 
ATCATCGAGC ACTACTOGAA CGAGGACACC TTCATOCTCA ACATGGAGAG CATGGTGGAG 1860 

GGCTTCAAGG TJCACGCTCAT GGAAATCIAS CCCTGGGTTT GGCATCCGCT TIGGCTGGAG 1920
CTCICAGTGC GTTCCTCCCT G AGAGAG ACA G AAGCCCCAG CCCCAGAACC TGG AGACCCA 1980 
TCTCCCCCAT CICACAACTG CTGTTACAAG ACCGTGCTGG GGAGTGGGGC AAGGOACAGQ 2040 
CCCCACAGTC GGTGTGCTK3 GCCCATCCAC TGGCACCTAC CAGGGAGOCG AAGCCTGAGC 2100 
CCCTCAGGAA GGTCCCTTAG GCCTGTTGGA TTCCTATTTA TTGOCCACCT TTTCCTGGAG 2160 

CCCAGGTCCA GGCCCGGCAG GACTCTGCAG GTCACTGCTA GCTCCAGATG AGACCGTCCA 2220
GCGTTCCCCC TTCAAGAGAA ACACTCATOC CGAACAGCCT AAAAAATTCC CATCCCTTCT 2280 
TTCICACCCC TCCATATCTA TATCTCCCGA GTGGCTOGAC AAAATGAGCT ACG TCIG GGT 2340 
GCAOTAGTTA TAGGTGGGGC AAGAGGTGGA TGCOCACTTT CTGGTCAGAC AGCTTTAGGT 2400 
TGCTCTGGGG AAGGCTGTCT TGCTAAATAC CTCCAGGGTT CCCAGCAAGT GGCCACCAGG 2460 

CCTTGTACAG G AAGACATTC AGTCACCGTG TAATTAGTAA CACAGAAAGT CTGCCTGTCT 2520
GCATTGTACA TAOTOTTTAT AATATTGTAA TAATATATTT TAOCTGTGGT ATGTGG GCAT 2 580 
GTTTACTGOC ACTGGCCTAa AGGAGACACA GACCTGGAGA COGTTTTAAT GGGGGTTTTT 2640 
GCCTCTGTGC CTGTTCAAGA GACTTGCAGG GCTAGGTAGA GGGCCTTTGG GATGTTAAGG 2700 
TGACTGCAGC TGATGCCAAG ATGGACICTG CAATGGGCAT AGCTGGGGGC TCGTICOCTG 2760 

TCOCCAG AGO AAGCCCCCTC TOCTTCTOCA TGGGCATGAC TCTOCTT0OA GGCCACCACG 2820
TTTATCTCAC AATGATGTGT TTTGCCTGAC TTTCCCnTG CGCTGTCICG TGGGAAAGGT 2880 
CATTCTGTCT GAGACOGCAG CTCCTTCTCC AGCTTTCGCT GCGGGCATGG CCTGAGCTTT 2940 
CTGGAGAGCC TCTGCAGGGG GTTTGCCATC AGGGCCCTGT GGCTGGGTCT GCTGCAOAGC 3000 
TCCTTGGCTA TCAGG AG AAT CCTGGACACT GTACTOTGCC TCCCAGTTTA CAAAQACGCC 3060 

CTTCATCTCA AGTGGCCCTT TAAAAGGCCT GCTGCCATGT GAGAGCTGTG AACAGCTCAG 3120
CTCTGAGTCG GCAGACTGGG GCTTCCTCCT GGGCCACCAG ATGGAAAGGG GGTATTGTTT 3180 
GCCTCACTCC TGGATGCTGC GTTTTAAGGA AGTGAGTGAG AAAGAATGTG CCAAG ATACC 3240 
TGGCTCCTGT GAAACCAGCC TCAGGAGGGA AACTGGGAGA GAOAAGCTGT GGTCTCCTGC 3300 
TACATGCCCT GGGAGCTGGA AGAGAAAAAC ACTCCCCTAA ACAATCGCAA AATG ATGAAC 3360 

CATCATGGGC CACTGTTCTC TTTGAGGGGA CAGGTTTAGG GGTTTGCGTT OGCCCTTGTG 3420
GGCTGAAGCA CTAGCTTTTT GGTAGCTAGA CACATOCTGC ACCCAAAGGT TCTCTACAAA 3480 
GGCCCAGATT TGTTTGTAAA GCACTTTGAC TCTTACCTGG AGGCCOGCTC TCTAAGGGCT 3540 
TCCTGCGCTC CCACCTCATC TGTCCCTGAQ ATGCAGAGCA GGATGGAGGG TCTGCTTCTA 3600 
GCTCAGCTGT TTCTCCTTG A GGTTGCGGAG GAATTGAATT GAATGGGACA GAGGGCAGGT 3660 

GCTGTGGCCA AGAAGATCTC CGAGCAGCAG TGACGGGGCA CCTTGCTGTG TGTCCTCTGG 3720
GCATGTTAAC CCTTCTGTGG GGCCAAAGGT TTGCATCOTG GATCCAGCTG TGCTCCAGTC 3780 
TGTCCCCTCC TCCTCCACTC TG ACTGCCAC GCCCCGGACC AGCAGCTTGG GGACCCTCCA 3840 
GGGTACTAAT GGGGCTCTGT TCTG AGATGG ACAAATTCAG TGTTGGAAAT ACATGTTGTA 3900 
CTATGCACTT CCCATGCTCC TAGGGTTAOG AATAGTTTCA AACATGATTG GCAGACATAA 3960 

CAACGGCAAA TACTCGOACT GGGGCATAGG ACTCCAGAGT AGG AAAAAGA CAAAAGATTT 4020
GGCAGCCTGA CACAGGCAAC CTACCCCTCT CTCTCCAGCC TCTTTATGAA ACTGTTTGTT 4080 
TGOCAGTCCT GCCCTAAGGC AO AAGATGAA TTGAAGATGC TGTGCATGTT TXCTAAGTCC 4140 
TTGAGCAATC ATGGTGGTGA CAATTGCCAC AAGGGATATG AGGCCAGTGC CACCAGAGGG 4200 
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TGGTGCCAAQ TGCCACATCC CTTCCOATCC ATTCCCCTCT GTATCCTOOG AGCACCCCAG 4260 
TTTGCCTTTO ATQTGTOCGC TGTOTATOTT AGCTGAACTT TQATGAGCAA AATTTOCTGA 4320 
CCOAAACACT CCAAAQAOAT AQGAAAACTT GCCGCCT C TI CTrmTOTC CCTTAATCAA 4 380 
ACTCAAATAA GCTTAAAAAA AATCCATGOA AGATCATGGA CATGTOAAAT GAGCATTTTT 4440 
n C i mCIT TTTTTTnTTTTTTTI I A AC AAAGTCTGAA CTGAACAOAA CAAOACTnT 4500 
TOCTCATACA TCTCCAAATT GTTTAAACTT ACTTTATQAQ TGTTTGTTTA GAAGTTCGQA 4560 
CCAACAGAAA AATGCAGTCA GATGTCATCT TGGAATTGGT TTCTAAAAGA GTAAGGCATG 4620 
TCCCTGCCCA GAAACTTAGG AAGCATOAAA TAAATCAAAT GTTTATTTTC CTTCTTATTT 4680 
AAAATCATGC TAATGCAACA GAAATAGAGG GTTTGTGCCA AATGCTATGA ACGGCCCTTT 4740 
CTTAAAGACA AGCAAGGGAG ATTOATATAT GTACAATTTG CTCTCATGTT TTT 
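
The listing layout used here (blocks of ten bases with a running position count at the end of each row) can be flattened and indexed as sketched below. The helper names `listing_to_sequence` and `subsequence` are hypothetical, and the sketch assumes that coding-sequence coordinates are 1-based and inclusive, which is consistent with the ATG found at position 13 of the first BCU4 row.

```python
def listing_to_sequence(lines):
    """Join listing rows, dropping block spacing and the trailing position count."""
    parts = []
    for line in lines:
        groups = line.split()
        if groups and groups[-1].isdigit():
            groups = groups[:-1]  # remove the running position number
        parts.append("".join(groups))
    return "".join(parts)


def subsequence(sequence, start, end):
    """Return positions start..end, assuming 1-based inclusive coordinates."""
    return sequence[start - 1:end]


first_row = ["ATTGGATCAA ACATGTCACA AGAGTCGGAC AATAATAAAA GACTAGTGGC CTTAGTGCCC 60"]
sequence = listing_to_sequence(first_row)
print(len(sequence), subsequence(sequence, 13, 15))
```

The same two helpers could be applied to a complete listing once any characters outside A, C, G and T have been resolved.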



SEQ ID NO:2 BCU4 Protein sequence:
Protein Accession #: NP_079191.1

1 11 21 31 41 51
| | | | | |

MSQESDNNKR LVALVPMPSD PPFNTRRAYT SEDEAWKSYL ENPLTAATKA MMIINGDEDS 60
AAALGLLYDY YKVPRDKRLL SVSKASDSQE DQEKRNCLGT SEAQSNLSGG ENRVQVLKTV 120
PVNLSLNQDH LENSKREQYS ISFPESSAII PVSGITVVKA EDFTPVFMAP PVHYPRGDGE 180
EQRVVIFEQT QYDVPSLATH SAYLKDDQRS TPDSTYSESF KDAATEKFRS ASVGAEEYMY 240
DQTSSGTFQY TLEATKSLRQ KQGEGPMTYL NKGQFYAITL SETGDNKCFR HPISKVRSVV 300
MVVFSEDKNR DEQLKYWKYW HSRQHTAKQR VLDIADYKES FNTIGNIEEI AYNAVSFTWD 360
VNEEAKIFIT VNCLSTDFSS QKGVKGLPLM IQIDTYSYNN RSNKPIHRAY CQIKVFCDKG 420
AERKIRDEEQ KQNRKNGKGQ ASQTQCNSSS DGKLAAIPLQ KKSDITYFKT MPDLHSQPVL 480
FIPDVHFANL QRTGQVYYNT DDEREGGSVL VKRMFRPMEE EFGPVPSKQM KEEGTKRVLL 540
YVRKETDDVF DALMLKSPTV MGLMEAISEK YGLPVEKIAK LYKKSKKGIL VNMDDNIIEH 600
YSNEDTFILN MESMVEGFKV TLMEI



SEQ ID NO:3 BCU7 DNA SEQUENCE VARIANT 1:

Nucleic Acid Accession #: AA428062

Coding sequence: 1-777 (entire sequence represents open reading frame)

1 11 21 31 41 51

| | | | | |

ATGATAGCAA TCTCTGCCGT CAGCAGTGCA CTCCTGTTCT CCCTTCTCTG TGAAGCAAGT 60 

ACCGTCGTCC TACTCAATTC CACTGACTCA TCCCCGCCAA CCAATAATTT CACTGATATT 120 

GAAGCAGCTC TGAAAGCACA ATTAGATTCA GCGGATATCC CCAAAGCCAG GCGGAAGCGC 180 

TACATTTCGC AGAATGACAT GATCGCCATT CTTGATTATC ATAATCAAGT TCGGGGCAAA 240

GTGTTCCCAC CGGCAGCAAA TATGGAATAT ATGGTTTGGG ATGAAAATCT TGCAAAATCG 300

GCAGAGGCTT GGGCGGCTAC TTGCATTTGG GACCATGGAC CTTCTTACTT ACTGAGATTT 360

TTGGGCCAAA ATCTATCTCT ACGCACTGGA AGATATCGCT CTATTCTCCA GTTGGTCAAG 420

CCATGGTATG ATGAAGTGAA AGATTATGCT TTTCCATATC CCCAGGATTG CAACCCCAGA 480

TGTCCTATGA GATGTTTTGG TCCCATGTGC ACACATTATA CGCAGATGGT TTGGGCCACT 540

TCCAATCGGA TAGGATGCGC AATTCATGCT TGCCAAAACA TGAATGTTTG GGGATCTGTG 600

TGGCGACGTG CAGTTTACTT GGTATGCAAC TATGCCCCAA AGGGCAATTG GATTGGAGAA 660

GCAGCATATA AAGTAGGGGT ACCATGTTCA TCTTGTCCTC CAAGTTATGG GGGATCTTGT 720
ACTGACAATC TGTGTTTTCC AGGAGTTACG TCAAACTACC TGTACTGGTT TAAATAA

SEQ ID NO:4 BCU7 DNA SEQUENCE VARIANT 2:

Nucleic Acid Accession #: AA428062

Coding sequence: 1-777 (entire sequence represents open reading frame)

1 11 21 31 41 51

| | | | | |

ATGATAGCAA TCTCTGCCGT CAGCAGTGCA CTCCTGTTCT CCCTTCTCTG TGAAGCAAGT 60 

ACCGTCGTCC TACTCAATTC CACTGACTCA TCCCCGCCAA CCAATAATTT CACTGATATT 120 

GAAGCAGCTC TGAAAGCACA ATTAGATTCA GCGGATATCC CCAAAGCCAG GCGGAAGCGC 180 

TACATTTCGC AGAATGACAT GATCGCCATT CTTGATTATC ATAATCAAGT TCGGGGCAAA 240 

GTGTTCCCAC CGGCAGCAAA TATGGAATAT ATGGTTTGGG ATGAAAATCT TGCAAAATCG 300 

GCAGAGGCTT GGGCGGCTAC TTGCATTTGG GACCATGGAC CTTCTTACTT ACTGAGATTT 360 

TTGGGCCAAA ATCTATCTGT ACGCACTGGA AGATATCGCT CTATTCTCCA GTTGGTCAAG 420 

CCATGGTATG ATGAAGTGAA AGATTATGCT TTTCCATATC CCCAGGATTG CAACCCCAGA 480 

TGTCCTATGA GATGTTTTGG TCCCATGTGC ACACATTATA CGCAGATGGT TTGGGCCACT 540 

TCCAATCGGA TAGGATGCGC AATTCATACT TGCCAAAACA TGAATGTTTG GGGATCTGTG 600 

TGGCGACGTG CAGTTTACTT GGTATGCAAC TATGCCCCAA AGGGCAATTG GATTGGAGAA 660 

GCACCATATA AAGTAGGGGT ACCATGTTCA TCTTGTCCTC CAAGTTATGG GGGATCTTGT 720 
ACTGACAATC TGTGTTTTCC AGGAGTTACG TCAAACTACC TGTACTGGTT TAAATAA

SEQ ID NO:5 BCU7 Protein sequence Variant 1:
Protein Accession #: none

1 11 21 31 41 51

| | | | | |

MIAISAVSSA LLFSLLCEAS TVVLLNSTDS SPPTNNFTDI EAALKAQLDS ADIPKARRKR 60
YISQNDMIAI LDYHNQVRGK VFPPAANMEY MVWDENLAKS AEAWAATCIW DHGPSYLLRF 120
LGQNLSVRTG RYRSILQLVK PWYDEVKDYA FPYPQDCNPR CPMRCFGPMC THYTQMVWAT 180
SNRIGCAIHA CQNMNVWGSV WRRAVYLVCN YAPKGNWIGE APYKVGVPCS SCPPSYGGSC 240
TDNLCFPGVT SNYLYWFK

SEQ ID NO:6 BCU7 Protein sequence Variant 2:
Protein Accession #: none

1 11 21 31 41 51

| | | | | |

MIAISAVSSA LLFSLLCEAS TVVLLNSTDS SPPTNNFTDI EAALKAQLDS ADIPKARRKR 60
YISQNDMIAI LDYHNQVRGK VFPPAANMEY MVWDENLAKS AEAWAATCIW DHGPSYLLRF 120
LGQNLSVRTG RYRSILQLVK PWYDEVKDYA FPYPQDCNPR CPMRCFGPMC THYTQMVWAT 180
SNRIGCAIHT CQNMNVWGSV WRRAVYLVCN YAPKGNWIGE APYKVGVPCS SCPPSYGGSC 240
TDNLCFPGVT SNYLYWFK
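
Because the BCU7 entries state that the entire 1-777 sequence is an open reading frame, the relationship between the DNA and protein listings can be checked by translation. The sketch below is illustrative only: it builds the standard genetic code table (an assumption, since the listing does not name a translation table), uses a hypothetical helper `translate`, and translates the first and last stretches of the BCU7 DNA listing.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    residues = []
    for index in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[index:index + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


# First 60 bases of the BCU7 DNA listing, with block spacing removed.
first_row = "ATGATAGCAATCTCTGCCGTCAGCAGTGCACTCCTGTTCTCCCTTCTCTGTGAAGCAAGT"
print(translate(first_row))
# Final 12 bases of the listing, ending in the TAA stop codon.
print(translate("TGGTTTAAATAA"))
```

A lookup in `CODON_TABLE` raises `KeyError` for any character other than A, C, G or T, which is a convenient way to locate transcription damage in a listing before comparing it with the protein sequence.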

SEQ ID NO:7 BCX2 DNA SEQUENCE

Nucleic Acid Accession #: NM_003014

Coding sequence: 238-1278 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

GGCGGGTTCG CGCCCCG AAG GCTGAO AGCT GGCGCICCTC GTGCCCTGTG TGCCAGACGG 60 
CGGAGCTCCO CCGCCGOACC CCGCGGCCCC GCTTTGCTGC CGACTGGAGT TTGGGGGAAG 120 
AAACTCIOCT GCGCCCCAGA AOATTTCTTC CTCOGCOAAa OOACAGCOAA AG ATGAGGGT 180 
GGCAGGAAG A G AAGGCGCTT TCTGTCTGCC GGGGTCGCAQ CGOGAGAGGG CAGTGCCAjg 240 
TTCCTCTCCA TCCTAGTGGC GCTCTGCCTG TGGCTGCACC TGGGGCTGGG CGTGCGCGGC 300 
GCGCCCTGCG AGGCGGTOCG CATCCCTATQ TGCCGGCACA TGCCCTOGAA CATCACGCGO 360 
ATGCCCAACC ACCTGCACCA CAGCACGCAG GAGAACGCCA TOCTGGCCAT CGAGCAGTAC 420 
GAGGAGCTGG TGGACGTGAA CTGCAGCGCC GTGCTGOGCT TCTTCTTCTG TGCCATGTAC 480 
GCGCCCATTT GCACCCTGG A GTTCCTGCAC GACGCTATCA AGCCGTGCAA GTCGGTGTGC 540 
CAACGCGCGC GCGACGACTG CGAGCCCCTC ATGAAGATGT ACAACCACAG CTGGCCCGAA 600 
AGCCTGGCCT GOOAOGAGCT GCCTOTCTAT GACCGTGGCG TGTGCATTTC GCCTQAAGCC 660 
ATCGTCACGG ACCTCCCGGA GGATGTTAAG TGGATAGACA TCACACCAGA CATGATGGTA 720 
CAGGAAAGGC CTCITBATGT TGACTGTAAA CGCCTAAGCC COGATCGGTG CAAGTGTAAA 780 
AAGGTGAAGC CAACTTTGGC AACGTATCTC AGCAAAAACT ACAGCTATOT TATTCATGCC 840 
AAAATAAAAG CTGTGCAGAG GAGTGGCIGC AATGAGGTCA CAACGGTGGT GGATGTAAAA 900 
GAG ATCTTCA AGTCCTCATC ACCCATCCCT CGAACTCAAG TCCCGCTCAT TACAAATTCT 960 
TCTTGCCAGT GTCCACACAT CCIGCCCCAT CAAGATGTTC TCATCATGTG TTACGAGTGG 1020 
CGTTCAAGGA TGATGCTTCT TGAAAATTGC TTAGTTGAAA AATGGAGAGA TCAGCTTAGT 1080 
AAAAGATCCA TACAGTGGGA AGAGAGGCTG CAGGAACAGC GGAGAACAGT TCAGGACAAG 1140 
AAGAAAACAG CGGGGCGCAC CAGTCGTAGT AATGCCCCCA AACCAAAGGG AAAGCCTCCT 1200 
GCTCCCAAAC CAGOCAGTOC CAAGAAGAAC ATTAAAACTA GGAGTGCCCA GAAGAGAACA 1260 
AACCCG AAAA GAGTQ32AGC TAACTAGTTT CCAAAGCGG A GACTTCCGAC TTOCTTACAG 1320 
GATGAGGCTQ GGCATTGCCT GGGACAGCCT ATGTAAGGCC ATGTGCCCCT TGOCCTAACA 1380 
ACTCACTGCA GTGCTCTTCA TAGACACATC TTGCAGCATT TTTCTTAAGG CTATGGTTCA 1440 
GTTnTCTTT GTAAGCCATC ACAAGCCATA GTGGTAGGTT TGCCCTTTGG TACAGAAGGT 1500 
GAGTTAAAGC TGGTGGAAAA GGCTTATTGC ATTGCATTCA G AGTAACCTQ TGTGCATACT 1560 
CTAOAAGAGT AGGGAAAATA ATGCTTGTTA CAATTCGAOC TAA TATGTO C ATTGTAAAAT 1620 
AAATGOCATA TTTCAAACAA AACACGTAAT 11111 1ACAG TATGTTTTAT TACCTTTTGA 1680 
TATCTGTTGT TGCAATGTTA GTGATb"! 1 IT AAAATGTG AT GAAAATATAA TGT1T1 1AAG 1740 
AAGGAACAGT AGTGGAATGA ATGTTAAAAG ATCTTTATGT GTTTATGGTC TGCAGAAGGA 1800 
• 1 1 1 1 1U 1 UAT GAAAGGGGAT TTTTTGAAAA ATTAGAGAAG TAGCATATGG AAAATTATAA 1860 
TGTGTTrm TACCAATGAC TTCAGTTTCT GTTTTTAGCT AGAAACTTAA AAACAAAAAT 1920 
AATAATAAAG AAAAATAAAT AAAAAGG AGA GGCAOACAAT GTCTGGAT7C CTGTTTTTTG 1980 
GTTACCTGAT TTOCATGATC ATGATGCTTC TKJTCAACAC CCTCTTAAGC AGCAGCAGAA 2040 
ACAGTGAGTT TGTCTGTACC ATTAGGAGTT AGGTACTAAT TAGTTGGCTA ATGCTCAAGT 2100 
ATTTTATACC CACAAGAGAG GTATGTCACT CATCTfACTT CCCA GGACA T CCAC CCTGAG 2160 
AATAATTTGA CAAGCTTAAA AATGGCCTTC ATGTGAGTGC CAAATTTTGT TTTTCITCAT 2220 
TTAAATATTT TCTTTGCCTA AATACATGTC AGAGGAGTTA AATATAAATG TACAGAGAGG 2280 
AAAGTTGAGT TCCAOCTCTG AAATGAG AAT TACTTGACAG TTGGG ATACT TTAATCAGAA 2340 
AAAAAGAACT TATTTGCAGC ATCTTATCAA CAAATTTCAT AATTGTGGAC AATTGGAGGC 2400 
ATTTATTTTA AAAAACAATT TTATTGGCCT TTTGCTAACA CAGTAAGCAT GTATTTTATA 2460 
AGGCATTCAA TAAATGCACA ACGCCCAAAGG AAATAAAAT CCTATCTAAT CCTACTCTCC 2520 
ACTACACAGA GGTAATCACT ATTAGTATTT TGGCATATTA TTCTCCAGGT GTTTGCTTAT 2580 
GCACTTATAA AATGATTTGA ACAAATAAAA CTAGGAACCT GTATACATGT GTTTCATAAC 2640 
CTGCCTCCTT TGCTTGGCCC TTTATTGAGA TAAGTTTTGC TGTCAAGAAA GCAGAAACCA 2700 
TCTCATTTCT AACAGCTGTG TTATATTCCA TAGTATGCAT TACTCAACAA ACTGTTGTGC 2760 
TATTOGATAC TTAGGTGGTT TCTTCACTGA CAATACTGAA TAAACATCTC ACCGG AATTC 



SEQ ID NO:8 BCX2 Protein sequence:
Protein Accession #: NP_003005.1

1 11 21 31 41 51

MFLSILVALC LWLHLALGVR GAPCEAVRIP MCRHMPWNIT RMPNHLHHST QENAILAIEQ 60
YEELVDVNCS AVLRFFFCAM YAPICTLEFL HDPIKPCKSV CQRARDDCEP LMKMYNHSWP 120
ESLACDELPV YDRGVCISPE AIVTDLPEDV KWIDITPDMM VQERPLDVDC KRLSPDRCKC 180
KKVKPTLATY LSKNYSYVIH AKIKAVQRSG CNEVTTVVDV KEIFKSSSPI PRTQVPLITN 240
SSCQCPHILP HQDVLIMCYE WRSRMMLLEN CLVEKWRDQL SKRSIQWEER LQEQRRTVQD 300
KKKTAGRTSR SNPPKPKGKP PAPKPASPKK NIKTRSAQKR TNPKRV
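
The coding range given for each DNA entry implies a protein length that can be compared with the corresponding protein listing. The following sketch is a simple consistency check under the assumption that the stated range includes both the start and the stop codon; the helper names `expected_residues` and `count_residues` are hypothetical.

```python
def expected_residues(cds_start, cds_end):
    """Residues implied by a coding range that includes the stop codon."""
    length = cds_end - cds_start + 1
    if length % 3 != 0:
        raise ValueError("coding range is not a whole number of codons")
    return length // 3 - 1  # the stop codon encodes no residue


def count_residues(lines):
    """Count residues in protein listing rows, ignoring spacing and position counts."""
    total = 0
    for line in lines:
        groups = line.split()
        if groups and groups[-1].isdigit():
            groups = groups[:-1]  # drop the running position number
        total += sum(len(group) for group in groups)
    return total


print(expected_residues(238, 1278))  # BCX2 coding range
print(expected_residues(1, 777))     # BCU7 coding range
print(count_residues(["KKKTAQRTSR SNPPKPKGKP PAPKPASPKK NIKTRSAQKR TOPKRV"]))
```

For BCX2, the final protein row follows position 300 and holds 46 residues, which agrees with the 346 residues implied by the 238-1278 range.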

SEQ ID NO:9 CBK1 DNA SEQUENCE

Nucleic Acid Accession #: NM_032391

Coding sequence: 129-[illegible] (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

GTOCTTOCTC TCCTAGCCTA AGGCGTGCAA ACAGAGCGCC ACTGGGAGGC TGAAACCTTT 60 

AGGCCQATGC TTGCTTGCAA GGTCAGGCAA GCTGGATTCT GOTCCCCACC TTTGCAGAGA 120 

GAACAGC GAT CTTGTGCGCC CATTTCTCRG ATCAAGGACC GGCCCATCTT AETACCTCCA 180 

AGAQTGCTTT TCTCTCTAAT AAGAAAACAT CTACTTTGAA ACATCTACTG GGCGAGACCA 240 

GGAGTGATGG CTCAGCCTGT AATTCTGGAA TTTCGGOAGG CCGAGGCAGG AAGATTCCTT 300 

GAGCACAGGA GTTCCACACC AGCCTGGGCA ATGTAQCAAG ACGCTGTCTC TATTTATACA 360 
ATAAAATTTT TTTAAAAAAG 6 



SEQ ID NO:10 CBK1 Protein sequence:
Protein Accession #: NP_115767

1 11 21 31 41 51

I I I I I I

MLCAHF5DQO PAHLTTSKSA FLSNKKTSTL KHLLGETRSD GSACNSGBG GRGRKW 

SEQ ID NO:11 CHA1 DNA SEQUENCE

Nucleic Acid Accession #: NM_020182

Coding sequence: 95- (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

TCCTTGGGTT CGGGTGAAAG CGCCTGGGGG TTCGTGGCCA TGATCCCCGA GCTGCTGGAG 60 

AACTGAAGGC GOACAGTCTC CTGCGAAAOC AGGCAATGGC GGAGCTGGAG TTTGTTCAfiA 120 

TCATCATCAT CGTGGTGGTG ATGATGGTGA TGGTGGTGGT GATCAjCGTGC CTGCTGAGCC 180 

ACTACARGCT GTCTGCACGG TCCTTCATCA GCCGGCACAG CCAGGGGCGG AGGAGAGAAG 240 

ATGCCCTGTC CTCAQRAGGA T GC CTGTGGC CCTCGGAGAG CACAGTGTCA GGCARCGGAA 300 

TCCCAGAGCC GCM3GTCTAC GCCCCGCCTC GGCCCACCGA OCGCCTGGCC GTGCCGCCCT 360 

TCGCCCAGCG GGAGCGCTTC CACCGCTTCC AGCCCACCTA TCC0TACCT6 CAGCACGAGA 420 

TCGACCTGGC ACCCACCATC TCGCTGTCAG ACGGGGAGGA GCCCCCACCC TACCAGGGCC 480 

OCTGCACOCT CCAGCTTCGG) GACCCCGACC AGCAGCTGGA ACTGAACCGG GAGTCGGTCC 540 

GCGCACCCCC AAACAGAACC ATCTTCGACA QTGACCTGAT GGAXAOTGCC AGGCTGGGGG 600 

GO0CCTGOCC COCCAGCAGT AACTCGGGCA TCAGCGCCAC GTGCTACGGC AGCGGCGGGC 660 

GCATGGAGGG GOCGCCG C CC ACCTACAGCG AGGTCATCGG CCACTACCCG GGGTCCTCCT 720 

TCCAGCACCA GCAGAGCAGT GGGOCGCOCT CCTTGCTGGA GGGGACCC6G CTCCACCACA 780 

CACACATCGC GCCCCTAGAG AGCGCAGCCA TCTGGAGCAA AGAGAAGGAT AAACAGAAAG . 840 

GACACCCTCT CTAGGGTCCC CAGGGGGGCC GOGCTGGGGC TGCGTAOGT0 AAAAGGCAGA 900 

ACACTCCGCG CTTCTTAGAA GAGGAGTGAG AGGAAGGCGG GGGGCGCAGC AAOGCATCGT 960 

GTGGCCCTCC CCTCCCACCT CCCTGTGTAT AAATATTTAC ATGT6ATGTC TGGTCTGAAT 1020 

GCACftAGCTA AGAGAGCTTG CAAAAAAAAA AAGAAAAAAQ AAAAAAAAAA ACCACGTTTC 1080 

TTTGTTGAGC T GTOTCTTGA AGGCAAAAGA AAAAAAATTT CTACAGTAAA AAAAAAAAAA 1140 
A 

SEQ ID NO:12 CHA1 Protein sequence:
Protein Accession #: NP_064587

1 11 21 31 41 51

I I I I I I

HAELEFVQII IIWVMHVHV WITCLLSHV KLSARSFISR HSQGRHREQA LSSEGCLWPS 60 
ESTVSGNGIP EPQVYAPPRP TDRLAVPPFA QRERFHRFQP TYPYLQHBID LPPTISLSDG 120 
EEPPPYQGPC TWLRDPEQQ LELNRESVRA PPNRTIFDSD LHDSARLGGP CPPSSKSGIS 180 
ATCYGSGGRH EGPPPTYSEV IGHYPGSSFQ HQQSSGPPSL LEGTRLHHTH IAPLESAAIW 240 
SKEKDKQKGB PL 

SEQ ID NO:13 CJA5 DNA SEQUENCE

Nucleic Acid Accession #: NM_012445

Coding sequence: 276-1271 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51

I I I I I I 
GCACGAGGGA AGAGGGTGAT CCGACCCGGG GAASQTCGCT GGGCAGGGCG AGTTGGGAAA 60 

GCGGCAGCCC COSCCGCOCC CGCAGCCCCT TCTCCTCCTT ICTCOCACGT CCTATCTGCC 120 

TCTCGCTGGA GGCCAGGCCG TGCAGCATCG AAGACAGGAG GAACTGGAGC CTCATTGGCC 180 

GGCG0GGG6C GCCGGOCTCO GGCTTAAATA GOASCTCCGG CCTCTGCCTG GGACCCGACC 240 

OCTGOCGGCC GCGCTCCCGC TGCTCCTGOC GGGTGATGGA AAACCCCAGC CCGGCCGOCG 300 

OOCTGGGCAA GGCCCTCTGC GCTCTCCTCC TGGCCACTCT CGGCGCOGCC GGCCAGCCTC 360 

TTGCGGGAGA GTOCATCTGT TCCGCCAGAG CCCCG O C CA A ATACAGCATC ACCTTCACGG 420 

GCAAGTGGAG CCAGACGGCC TTOCCCA&GC AGTACCCCCT GTTCCGCCCC CCTGCGCAGT 480 

GGTCTTCGCT G C TOGCCCOC GCGCATAGCT CCGACTACAG CATGTGGAGG AAGAACCAGT 540 

ACGTCAGTAA CGGGCTGCGC GACTTTGCGG AGCGCGGCGA GG0CTGGGC6 CTGATGAAGG 600 

AGATCGAGGC GGCGGGOOAG GCGCTGCAGA GCGTGCAOGC GGTGTTTTCO GCGCCCGCCG 660 

TCCCCAGCGG CA00G6GCAG ACGTCGGCGG AGCTGGAGGT GCAGCGCAGG CACTCGCTGG 720 

TCTCttTr T GT GGTGCGCATC GTGCCCAGCC CCGACTGGTT CGTOGGCGTG GACAGCCTGG 780 

ACCTOTGCGJA OGGOGACCGT TGGCGGGAAC AGGCGGCGCT GGACCTGTAC CCCTACGACG 840 

CCGGGACGGA CAGCGGCTTC ACCTTCTCCT CCOJCAACTT CGCCACCATC CCGCAGGACA 900 

CGGTGACCGA GATAACGTCC TCCTCTCCCA GCCACCCGOC CAACTCCTTC TACTACCCGC 960 

GGCTGAASGC CCTGCCTCCC ATCGCCAGGG TGACACTGGT GCGGCTGCGA CAGAGCCCCA 1020 

GGGCCTTCAT CCCTOCC6CC CCAGTCCTGC CCAGCAGGGA CAA7GAGATT GTAGACAGCG 1080 

CCTCAGTTCC AGAAACGCCQ CTGOACTGCG AGGTCTCCCT GTGGTCGTCC TGGGGACTGT 1140 

GCGGAGGCCA CTGTGGGAGG CTCGGGACCA AGAGCAGGAC TCGCTACGTC CGGGTCCAGC 1200 

CCGCCAACAA CGGGAGCCCC TGCCCCGA6C TCGAAGAAGA GGCTGAGTGC GTCCCTGATA 1260 

ACTGCGT CTA A6 ACCAGA0C CCOGCAGCCC CTGGGGCCCC CGGAGCCATG GGGTGTCGGG 1320 

GQCTCCTGTG CAGGCTCATG CTGCAGGCGG CCGAGGCACA GGGGGTTTCG CGCTGCTCCT 1380 

GACCGCGSTG AGGGCGCGCC GACCATCTCT GCACTGAAGG GCCCTCTGGT GGCCGGCACG 1440 

GGCATTGGGA AACAGCCTCC TCCTTTCCCA ACCTTGCTTC TTAGGGGCCC OCGTCTCCCG 1500 

TCTGCTCTCA GCCTCCTCCT CCTGCAGGAT AAAGTCATCC CCAAGGCTCC AGCTACTCTA 1560 

AATTATGGTC TCCTTATAAG TTATTGCT3C TCCAGGAGAT TGTCCTTCAT CGTCCAGGGG 1620 

CCTGGCTCCC ACGTGGTTGC AGATACCTCA GACCTGGTGC TCTAGGCTGT GCTGAGCCCA 1680 

CTCTOCC O aa GGCGCATCCA AGCGGGGGCC ACTTGAGAAG TGAATAAATS GGGCGGTTTC 1740 

GGAAGCGTCA GTGTTTCCAT GTTATGGATC TCTCTGCGTT TGAATAAAGA CTATCTCTGT 1800 
TGCTCAC 



SEQ ID NO:14 CJA5 Protein sequence:
Protein Accession #: NP_036577

1 11 21 31 41 51

I I I I I I

KENPSPAAAL GKALCALLLA TLGAAGQFLG GESICSARAP AKYSITFTGK WSQTAPPKQY 60 
PLFRPPAQWS SLLGAAHSSD YSHWRKNQYV SNGLRDFAER GEAWALMKEI EAAGEALQSV 120 
BAVFSAPAVF SGTGQTSAEL EVQRRHSLVS FWRIVPSPD WFVGVDSLDL CDGDEWREQA 180 
ALDLYPYDAG TDSGPTFSSP NPATIPQDTV TEITSSSPSH PANSFYYPRL KALPPIARVT 240 
LVRLRQSPRA PIPEAPVLPS RDNEIVDSAS VPETPLDCEV SLWSSWGLCG GHCGRLGTKS 300 
RTRYVRVQPA UNGSPCPELB EEAECVPDNC V 



SEQ ID NO:15 LBH9 DNA SEQUENCE

Nucleic Acid Accession #: NM_002391

Coding sequence: 26-457 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

CGGGCGAAGC AGCGCGGGCA GCGAGATGCA GCACCGAGGC TTCCTCCTCC TCACCCTCCT 60 

CGCCCTGCTQ GCGCTCACCT CCGCG6TCGC CAAAAAGAAA QATAAGQTGA AGAAGGGCGG 120 

CCCGGGGAGC GAGTGCGCTG AGTGGGCCTG GGGGCCCTGC ACCCCCAGCA GCAAGGATTG 180 

CGGCGTGGGT TTC06CGAG6 GCACCTGCGG GGCCCAGACC CAGCGCATCC GQT6CAGGGT 240 

GCOCTS C AAC TGGAAGAAGG AGTTTGGAGC CGACTGCAAG TACAAGTTTG AGAACTG6G6 300 

TGCGltTTGAT GGGGGCACAG GCACCAAAGT CCGCCAAGGC ACCCTGAAGA AGGCGCGCTA 360 

CAATGCTCAQ TGCCAGGAGA CCATCCGOST CACCAAGCOC TGCACCCCCA AGAOCAAAGC 420 

AAAGGCCAAA GCCAAGAAAG GGAAGGGAAA GGACTAGAOG CCAAGCCTGG ATGOCAAGGA 480 

GCCCCTGGTG TCACATGGGO CCTGGCCACG CXXTCCCTCT CCCAGGCCCG AGATGTGACC 540 

CACCAGTGCC TTCTGTCIGC TCGTTAGCTT TAATCAATCA TGCCCTGCCT TGTCCCTCTC 600 

ACTCCCCAGC CCCACCCCTA AGTGCOCAAA GTGGGGAGGG ACAAGGGATT CTGGGAAGCT 660 

TOAGCCTCCC CCAAAGCAAT GTGAGTCCCA GAGCCCGCTT TTGTTCTTCC CCACAAITCC 720 

ATTACTAAGA AACACATCAA ATAAACTGAC TTTTTCCCCC CAATAAAAGC TCTTCTTTTT 780 
TAATAT 



SEQ ID NO:16 LBH9 Protein sequence:
Protein Accession #: NP_002382

1 11 21 31 41 51 

I I I I I I 

HQHRGFLLLT LLALLALT5A VAKKKDKVKK GGPGSECASW AHGPCTFSSK DCGVOPREGT 60 
CGAQTQRIRC RVPCNWKKEP GADCKYXFEH HGACDGGTGT KVRQGTLKXA RYNAQCQETI 120 
RVTKPCTPKT KAKAKAKKGK GRD 
SEQ ID NO:17 LEM9 DNA SEQUENCE

Nucleic Acid Accession #: NM_005244

Coding sequence: 1-1617 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

ATGGTAGAAC TAGTGATCTC ACCCAGCCTC ACTGTAAACA GCGATTGTCT GGATAAACTG 60 

AAGTTTAACC GTGCTGACGC TGCTGTGTGG ACTCTGAGTG ACAQACAAGQ CATCACCAAA 120 

TCGGCCCCCC TGA6AGTGTC CCAGCTCTTC TCCAGATCTT GCCCACGTGT CCTCCCCCGC 180 

CAGCCTTCCA CAGCCATGGC AGCCTACGGC CAGACGCAGT ACAGTGCGGG GATCCAGCAG 240 

GCTACCCCCT ATACAGCTTA CCCACCTCCA GCACAAGCCT ATGGAATCCC TTCCTACAGC 300 

ATCAAGACAG AAGACAGCTT GAACCATTCC CCTGGCCAGA GTGGATTCCT CAGCTATGGC 360 

TCCAGCTTCA GCACCTCACC CACTGOACAG AGCCCATACA CCTACCAGAT GCACQGCACA 420 

ACAGGGTTCT ATCAAGGAGG AAATGGACTG GGCAACOCAG C06GTTT0G6 GAGTGTGCAC 480 

CAGGACTATC CTTCCT A CCC CGGCTTCCCC CAGAGCCAGT ACCCCCAGTA TTACGGCTCA 540 

TCCTACAACC CTCCCTACGT CCCGGCCAGC AGCATCTGCC CTTCGCCCCT CTOC ACGT CC 600 

ACCTACGTCC TCCAGGAGGC AICTCACAAC CTOCCCAACC AGAGTTOCGA GTCACTTGCT 660 

GGTGAATACA ACACACACAA TGGACCTTCC ACACCAOCGA AAGAGGGAGA CACAGACAGG 720 

CCGCACOGGG CCTCOGACGG GAAGCTOCGA GGCCGGTCTA AGAGGAGCAG TGACCCGTCC 780 

CCGGCAGGGG ACAATGAGAT TGAGCGTGTG TTCGTGTGGG ACTTGGATCA GACAATAATT 840 

ATTTTTCACT CCTTACTCAC GGGGACATTT GCATCCAGAT ACGGGAAGGA CAOCACGACG 900 

TCCGTGCGCA TTGGOCTTAT GATGGAAGAG ATGATCTTCA ACCTTGCAGA TACACATCTG 960 

TTCTTCAAT3 ACCTGGAGGA TTGTGACCAG ATCCACGTTG ATGACGTCTC ATCAGATGAC 1020 

AATGGCCAAG ATTTAAGCAC ATACAACTTC TCCGCTGACG GCTTCCACAG TTCGGCCCCA 1080 

GGAGCCAACC TGTGCCTQGG CTCTGGCGTG CACGGCGGCG TGGACTGGAT GAGGAAGCTG 1140 

GCCTTCCGCT ACCGGCGGGT GAAGGAOATG TACAATACCT ACAAGAACAA CGTTGQTGGG 1200 

TTGAIAGGCA CTCCCAAAAG GGAGACCTGG CTACAGCTCC GAGCTGAGCT GGAAGCTCTC 1260 

ACAGACCTCT GGCIGACCCA CTCCCTGAAG GCACTAAACC TCATCAACTC C CGGCCC AAC 1320 

TGTGTCAATG TGCTGGTCAC CACCACTCAA CTAATTCCTG CCCTGGCCAA ACTCCTGCTA 1380 

TATGGCCTGG GGTCTGTGTT TCCTATTGAG AACATCTACA GTGCAACCAA GACAGGGAAG 1440 

GAGAGCTGCT TCGAGAGGAT AATGCAGAGA TTCGGCAGAA AAGCTGTCTA CGTGGTGATC 1500 

GGTGATGGTG TGGAAGAGGA GCAAGGAGCG AAAAAGCACA ACATGCCTTT CTGGCGGATA 1560 
TOCIGCCACG CAGACCTGGA GGCACTG AGG CAOGCCCTGG AACTGGAGTA TTT ATAG 

SEQ ID NO:18 LEM9 Protein sequence:
Protein Accession #: NP_005235

1 11 21 31 41 51

I I I I I I

H7ELVISPSL TVNSDCLDKL KFNRADAAVW TLSDRQGITK SAPLRVSQLF SRSCPKVLPR 60 

QPSTAMAAVG QTQYSAGIQQ ATPYTAYPPP AQAYGIPSYS 1KTHDSUJHS PGQSGFLSYG 120 

SSFSTSPTGQ SPYTYQMHGT TGFYQGGNGL GNAAGPGSVH QDYPSYPGFP QSQYPQYYGS 180 

?SYNPPYVPAS 'SICPSPLSTS TWI>QEASHN 'VPNQSSESIA GEYNTHNGPS TPAKEGDTDR 240 

PHRASDGKLR GRSKRSSDPS PAGHJEIERV FVWDLDETH IFHSLLTGTF ASRYGKDTTT 300 

SVRIGU4HEE HIFNLADTHL FFNDI4KDCDQ IHVDDVSSDD NGQDLSTYNF SADGFHSSAP 360 

GAMLCLGSGV HGGVDWMRKL AFRYRKVKEM YNTYKNNVGG LIGTPKRETW LQLBAELEAL 420 

TDLWLTHSLK ALNLINSRPH CVNVtVTTTQ LIPALAKVLL YGLGSVFPIE UIYSATKTGK 480 
BSCPERIKQR PGRKAWWI GDGVSBEQGA KKKNHPFWRI SCHADLBALR HALELEYL 

SEQ ID NO:19 OAA1 DNA SEQUENCE

Nucleic Acid Accession #: NM_002740

Coding sequence: 178-1958 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CCGCGGTTCC GGCTGCTCCG GOGAGGCGAC CCTTGGGTCG GCGCTGOGGG CGAGGTGGGC 60 

AGGTAGGTGG GCGGACGGCC GCGGTTCTCC GGCAAGCGCA GGCGGCGGAG TCW3CCACGG 120 

CGCCCGAAGC GCCCCCCGCA CCCCCGGCCT CCAGCGTTGA GGCGGGGGAG TGAGGAGATG 180 

CCGACCCAGA GGGACA&CAG CACCATGTCC CACACGGTCG CAGGCGGCGG CAGCGGGGAC 240 

CATTCCCACC AGGTCOGGGT GAAAGCCTAC TACCGCGGGG ATATCATGAT AA CACATT TT 300 

GAACCTTCCA TCTCCTTTGA GGGCCTTTGC AATGAGGTTC GAGACATGTG TTCTTTTGAC 360 

AACGAACAGC TCTTCACCAT GAAATGGATA GATGAGGAAG GAGACCCGTG TACAGTATCA 420 

TCTCAGTTGG AGTTAGAACA AGCCTTTAGA CTTTATGAGC TAAACAAGGA TTCTGAACTC 480 

TTGATTCATG TGTTCCCTTG TCTACCAGAA CGTCCTGGGA TGCCTTGTCC AGGAGAAGAT 540 

AAATCCATCT ACCGTAGAGG TGCACGCCGC TGGAGAAAGC TTTATTGTGC CAATGGCCAC 600 

ACTTTCCAAG CCAAGCGTTT CAACAGGCGT GCTCACTGTG CCATCTGCAC AGACCGAATA 660 

TGGGGACTTG GACGCCAAGG ATATAAGTGC ATCAACTGCA AACTCTTGGT TCATAAGAAG 720 

TCOCATAAAC TCGTCACAAT TGAATGTGGG CGGCATTCTT TCCCACAGGA ACCAGTQATG 780 

CCCATGGATC AGTCATCCAT GCATTCTGAC CATGCACAGA CAGTAATTCC ATATAATCCT 840 

TCAAGTCATC AGAGTTTGQA TCAAGTTCGT GAAGAAAAAG AGGCAATGAA CACCAGGGAA 900 

AGTGGCAAAG CTTCATCCAG TCTAGGTCIT CAGGATTTTG ATTTGCTCCG GGTAATAGGA 960 

AGAGGAAOTT ATGCCAAAGT ACTGTTGGTT CGATTAAAAA AAACAGATCG TATTTATGCA 1020 

ATGAAAGTTG TGAAAAAAGA G CTT C TT AAT GATGATGAGG ATATTGATTG GGTACAQACA 1080 

GAGAAGCATG TGTTTGAGCA GGCATCCAAT CATCCTTTCC TTGTTGGGCT GCATTCTTGC 1140 

TTTCAGACAG AAAGCAGATT GTTCTTTGTT ATAGAGTATG TAAATGGAGG AGACCTAATG 1200 

TTTCATATGC AGCGACAAAG AAAACTTCCT GAAGAACATG CCAGATTTTA CTCTGCAGAA 1260 
ATCAGTCTAG CATTAAATTA TCTTCATGAG CGAGGGATAA TTTATAGAGA TTTGAAACTG 1320 

GACAATGTAT TACTGQACTC TGAAGGCCAC RTTAAACTCA CTGACTACGG CATGTGTAAG 1380 

GAAGGATTAC GGCCAGGAGA TACAACCAGC ACTTTCTGTG GTACTCCTAA TTACATTGCT 1440 

CCTGAAATTT TAAGAGGAGA AGATTATGGT TTCAGTGTTG ACTGGT6GGC TCTTGGAGTG 1500 

CTCATGTTTG AGATGATGGC AGGAAGQTCT CCATTTGATA TTGTTGGGAG CTCCGATAAC 1560 

CCTGACCAGA ACACAGAGGA TTATCTCTTC CAAGTTATTT TGGAAAAACA AATTCGCATA 1620 

CCACGTTCTC TCTCTGTAAA AGCTGCAAGT OTTCTGAAGA GTTTTCTTAA TAAGGACCCT 1680 

AAGGAACGAT TGGGTTGTCA TCCTCAAACA GGAT TT GC T G AXATTCAGGG ACACCCCTTC 1740 

TTCCGAAATO TTGATTCGGA TATGATGGAG CAAAAACAGG TGGTACCTCC CTTTAAACCA 1800 

AATATTTCTQ GGGAATTTGG TTTGGACAAC TTTGATTCTC AGTTTACTAA TGAACCTGTC 1860 

CAGCTCACTC CAGATGACGA TGACATTGTG AGGAAGATTG ATCAGTCTGA ATTTGAAGGT 1920 

TTTGAGTATA TCAATCCTCT TTTQATOTCT GCAGAAOAAT GTGTCTGATC CTCATTTTTC 1980 

AACCATGTAT TCTACTCATG TTGCCATTTA ATGCATGOAT AAACTTGCTG CAAGCCTGGA 2040 

TACAATTAAC CATTTTATAT TTGCCACCTA CAAAAAAACA CCCAATATCT TCTCTTGTAG 2100 

ACTATATGAA TCAATTATTA CATCTGTTTT ACTATGAAAA AAAAATTAAT ACTACTAGCT 2160 

TCCAGACAAT CATGTCAAAA TTTAGTTGAA CTGGTTTTTC AGTTTTTAAA AGGCCTACAG 2220 
ATGAGTAATG AAGTTACCTT TTTTGTTTAA AAAAAAAAAA O 



SEQ ID NO:20 OAA1 Protein sequence:
Protein Accession #: NP_002731

1 11 21 31 41 51 

I I I I I I 

MSHTVAGGGS GDHSHQVRVK AYYRGDIMIT HPEPSISFEG LCHEVFDMCS FDMEQLFTMK 60 

WIDEGGDFCT VSSQLELEEA FHLYBLNKDS ELLIHVFPCV PERPGMPCK3 EDKSIYTOGA 120 

RRWRKLYCAN GHTFQAKRFH RRAECAICTD RIWGLGRQGY KCIHCKLLVH KKCHKLVTIE 180 

CGRHSLPQEP VMPMDQSSKH SDHAQTVIW NPSSHESLDQ VGEEKEAMNT RBSGKASSSL 240 

GLQDFDLLRV IGRGSYAKVL LVRLKKTDRI VAMKWKKEL VNDDEDIDW QTBKHVFEQA 300 

SKHPPLVGLH SCFQTESRLF FVIEYVNGGD LMFHHQRQRK LPEEHARFXS AEISLALNYL 360 

HERGIIYKDL KLDNVLLDSE GHIRLTDYGM CKEGLRPGDT TSTFCGTENY IAPEILRGED 420 

YGPSVDWWAL GVLMFEMMAG BSPPDIVGSS DNPDQNTEDY LFQVILEKQI RIPRSLSVKA 480 

ASVLKSFLNK DPKERLGCHP QTGFADIQGH PFFRHVDWDM MEQKQWPPF KPKISGEPGL 540 
DNFDSQFTNB PVQLTPDDDD IVRKTDQSEP EGFEYINPLL HSAEECV 

SEQ ID NO:21 OBH2 DNA SEQUENCE

Nucleic Acid Accession #: L05628

Coding sequence: 197-4792 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CCAGGCGGCG TTGCGGCCCC GGCCCCGGCT CCCTGCGCCG CCGCOGCCGC CGCCGCCGCC 60 

GCCGCCGCCG CCGCCGCCAG CGCTRGCGCC AGCAGCCGGG CCCGATCACC CGCOGCOCOG 120 

TGCCCGCCGC CGCCCGCGCC AGCAACCGG8 CCCGATCACC CGCCGCCCGG TGCCCGCCGC 180 

CGCCCGCGCC ACCGGCATGQ CGCTCCGGGG CTTCTGCAGC GCCGATGGCT CCGACCCGCT 240 

CTGGGACTGG AATGTCACQT GGAATACCAQ CAACCCCGAC TTCACCAAGT GCTTTCAGAA 300 

CACGGTCCTC GTGTGGGTGC CTTGTTTTTA CCTCTGGGCC TGTTTCCCCT TCTACTCCCT 360 

CTATCTCTCC CGACATGACC GAGGCTACAT TCAGATGACA CCTCTCAACA AAACCAAAAC 420 

TGCCTTGGGA TTTTTGCTGT GGATCGTCTG CTGGGCAGAC CTCTTCTACT CTTTCTGGGA 480 

AAGAAGTCGG GGCATATTCC TGGCCCCAGT GTTTCTGGTC AGCCCAACTC TCTTGGGCAT 540 

CACCACGCTG CTIGCTACCT TTTTAATTCA GCTGGAGAGG AGGAAGGGAG TTCAGTCTTC 600 

AGGGATCATG CTCACTTTCT GGCTGGTAGC CCTAGTGTGT GCCCTAGCCA TCCTQAGATC 660 

CAAAATTATG ACAGCCTTAA AAGAGGATGC CCAGGTGGAC CTGTTTCGTG ACATCACTTT 720 

CTACGTCTAC TTTTCCCTCT TACICATTCA GCTCGTCTTO TCCTGTTTCT CAGATCGCTC 780 

ACCCCTGTTC TCGGAAACCA TCCACGACCC TAATCCCTGC OCAGAGTCCA GCGCTTCCTT 840 

CCTGTCGAGG ATCACCTTCT GGTGGATCAC AGGGTTGATT GTCCGGGGCT ACCGCCAGCC 900 

CCTGGAGGGC AGTGACCTCT GGTCCTTAAA CAAGGAGGAC ACGTCGGAAC AAGTCGTGCC 960 

TGTTTTGGTA AAGAACTGGA AGAAGGAATG CGCCAAGACT AGGAAGCAGC CGGTGAAGGT 1020 

TGTGXACTCC TCCAAGGATC CTGCCCAGCC GAAAGAGAGT TCCAAGGTGG ATGCGAATQA 1080 

GGAGGTGGAG GCTTTGATCG TCAAGTCCCC ACAGAAGGAG TGGAACCCCT CTCTGTTTAA 1140 

GGTGTTATAC AAGACCTTTG GGCCCTACTP CCTCATGAGC TTCTTCTTCA AGGCCATCCA 1200 

CGACCTGATG ATGTTTTCCG GGCCGCAGAT CTTAAAGTTG CTCATCAAGT TCGTGAATGA 1260 

CACGAAGGCC CCAGACTGGC AGGGCTACTT CTACACCGTG CTGCTGTTTG TCACTGCCTG 1320 

CCTGCAGACC CTCCTGCTGC ACCAGTACTT CCACATCTGC TTCOTCAGTG GCATGAGGAT 1380 

CAAGAOCGCT GTCATTGGGG CTGTCTATCG GAAGGCCCTG GTGATCACCA ATTCAGCCAG 1440 

AAAATCCTCC ACGGTCGGGG AGATTGTCAA CCTCATGTCT GTGGACGCTC AGAGGTTCAT 1500 

GGACTTGGCC ACGTACATTA ACATGATCTG GTCAGCCCCC CTGCAAGTCA TCCTTGCTCT 1560 

CTACCTCCTG TGGCTGAATC TGGGCCCTTC CGTCCTGGCT GGAGTGGCGG TGATGGTCCT 1620 

CATGGTGCCC GTCAATGCTG TGATGGCGAT GAAGACCAAG ACGTATCAGG TGGCCCACAT 1680 

GAAGAGCAAA GACAATCGGA TCAAGCTGAT GAACGAAATT CTCAATGGGA TCAAAGTGCT 1740 

AAAGCTTTAT GCCTGGGAGC TCGCATTCAA GGACAAGGTG CTGGCCATCA GGCAGGAGGA 1800 

GCTGAAGGTG CTGAAGAAGT CTGCCTACCT GTCAGCCGTG GGCACCTTCA CCTGGGTCTG 1860 

CACGCCCTTT CTGGTGGCCT TGTGCACATT TGCCGTCTAC GTGACCATTG ACGAGAACAA 1920 

CATCCTGGAT GCCCAGACAG CCTTCGTGTC TTTGGCCTTG TTCAACATCC TCCGGTTTCC 1980 

CCTGAACATT CTCCCCATGG TCATCAGCAG CATCGTGCAG GCGAGTGTCT CCCTCAAACG 2040 

CCTGAGGATC TTTCTCTCCC ATGAGGAGCT GGAACCTGAC AGCATCGAGC GACGGCCTGT 2100 

CAAAGACGGC GGGGGCACGA ACAGCATCAC CGTGAGGAAT GCCACATTCA CCTGGGOCAG 2160 

GAGCGACCCT CCCACACTCA ATGGCAXCAC CTTCTCCATC CCCGAAGGTG CTTTGGTGGC 2220 
CGTGGTGGCC CAGGTGGGCT GCGGAAAGTC tMCCCTGCTC TCAGCCCTCT TGGCTGAGAT 2280 

GQACAAAGTG GAGGGGCACG TGGCTATCAA GGGCTOCBTG GCCTATGTGC CACAGCAGGC 2340 

CTGOATTCAG AATGATTCTC TCCGAGAAAA CATCCTTTTT GQATQTCAGC TGOAGGAACC 2400 

AXA9TACAGG TCCGTGATAC AGGCCTOTGC CCTCCTCCCA GACCTGGAAA TCCTGCOCAG 2460 

TGGGGATCGO ACAGAGATTG GCGAGAAGGG CGTGAACCTG TCTGGGGGCC ASAAGCAGOG 2520 

CGTGAGCCTG GCCCGGGCCG TGT A CTCCAA CGCTGACATT TACCTCTTCG ATGATCCCCT 2580 

CTCAGCAGTG GATGCCCATG TGGGAAAACA CATCTTTGAA AATGTGATTQ GCCCCAAGGG 2640 

QATGCIGAAG AACAAGACGC GGATCTTGGT CAOGCACASC ATGAGCTACT TGCCGCAGGT 2700 

GGAOGTCATC ATCGTCATGA GTGGCGGCAA GATCTCTGAO ATGGQCTCCT ACCAGGAGCT 2760 

GCTGGCTCGA GACGGCGCCT TCGCTGAGTT CCTGCS TA CC TATGCCAGCA CAGAGCAGGA 2820 

GCAGGATGCA GAGGAGAACG GGGTCACGGG CGTCAGCGGT CCAGGGAAGG AAGCAAAGCA 2880 

AATGGAGAAT GGCATGCTGG TGACGGACAG TGCAGSGAAG CAACTOCAGA GACAGCTCAG 2940 

CAGCTCCTCC TOCTATAGTG GGGACAICAG CAGGCACCAC AACAGCACCG CAGAACTGCA 3000 

CAAAGCTGAG GCCAAGAAGG AGGAGACCTG GAAGCTGATG GAGGCTGACA AGGCGCAQAC 3060 

AGGGCAGGTC AAGCTTTCCG TGTACTGGGA CTACATGAAG GCCATOGGAC TCTTCATCTC 3120 

CTTCCTCAGC ATCTT C C TT T TCATGTGTAA CCATGTGTCC GCGCTGQCTT CCAACTATTQ 3180 

GCTCAGCCTC TGGACTGATG ACCCCATCGT CAACGGGACT CAGGAGCACA CGAA AGTCC G 3240 

GCTGAGCGTC TATGOAGCCC TCGGCATTTC ACAAGGGATC GCCGTGTTTG GCTACTCCAT 3300 

CGCCGTOTCC ATCGGGGGGA TCTTGGCTTC CCGCTGTCTO CACGTGGACC TGCTGCACAG 3360 

CATCCTGCGG TCAOCCATGA G CTTCTTTG A GCGGACCCCC AGTGGGAACC TS GTGAA OCG 3420 

CT T C T OCAAO GAGCTGGACA CAGTCGACTC CATGATCCCQ GASGTCATCA AGATGTTCAT 3480 

GGGCTCOCTQ TTCAACGTCA TTGGTGCCTQ CATCGTTATC CTGCTGGCCA CGOCCATCGC 3540 

CGCCATCATC ATCCCGCCCC TTGGCCTCAT CTACTTCTTC GTCCAGAGGT TCTACGTGGC 3600 

TTCCTCOCGQ CAGCTGAAGC GCCTCGAGTC GGTCAGCCGC TCCCCGGTCT ATTCCCATTT 3660 

CAACGAGACC TTGCTGGGGG TCAGCGTCAT TCGAGCCTTC GAGGAGCAGG AQOGCTTCAT 3720 

CCAOCAGAGT GACCTGAAGG TGGACGAGAA CCAGAAGGCC TATTACCOCA GCATCGTGGC 3780 

CAACAGGTGG CTGGCCGTGC GGCTGGAGTG TGTGGGCAAC TGCATCGTTC IGTTTGCTGC 3840 

CCTQTTTCCQ GTGATCTCCA GGCACAGCCT CAGTGCTGGC TTGGTGGGOC TCTCAGTGTC 3900 

TTACTCATTG CAGGTCACCA OGT A CTTGAA CTGGCTGGTT CGGATGTCAT CTGAAAXGGA 3960 

AACCAACATC GTGGCCGTGG AGAGGCTCAA GGAGTAITCA GAGACTGAGA AGGAGGCGCC 4020 

CTGGCAAATC CAGGA6ACAG CTCCGCCCAG CAGCTGGCCC CAGGTGGGCC GAGTGGAATT 4080 

CCGGAACTAC TGCCTGCGCT ACCGAGAGGA CCTGGACTTC GTTCTCAGGC ACATCAATGT 4140 

CACGATCAAT GGGGGAGAAA AGGTCGGCAT CGTGGGGCGG ACGGGAGCTG GGAAGTCGTC 4200 

CCTGACCCTG GGCTTATTTC GGATCAACGA GTCTGCCGAA GGAGAGATCA TCATCGATGG 4260 

CATCAACATC GCCAAGATCG GCCTGCACGA CCTCCGCTTC AAGATCACCA TCATCCCCCA 4320 

GGACCCTGTT TTGTTTTCGG GTT C OCTCCG AATGAACCTG GACCCATTCA GCCAGTACTC 4380 

GGATGAAGAA GTCTGGACGT CCCTGGAGCT GGCCCACCTG AAGGACTTCG TGTCAGOCCT 4440 

TCCTGACAAG CTAGACCATG AATGTGCAGA AGGCGGGGAG AACCTCAGTG TCGGGCAGCG 4500 

CCAGCTTGTG TGCCTAGCCC GGGCCCTGCT GAGGAAGACG AAfiATCCTTG TGTTGGATGA 4560 

GGCCACGGCA GCCGTGGACC TGGAAACGGA CGACCTCATC CAGTCCACCA TCCGQACACA 4620 

GTTCGAGGAC TGCACCGTCC TCACCATCGC CCACCQGCTC AACACCAICA TGGACTACAC 4680 

AAGGGTGATC GTCTTGGACA AAGGAGAAAT CCAGGAGTAC GGCGCCCCAT CGGACCTCCT 4740 

GCAGCAGAGA GGTCTTTTCT ACAGC&TGGC CAAAGACGCC GGCTTGGTGT GAj GCCCCAGA 4800 

GCTGGCATAT CTGGTCAGAA CTGCAGGGCC TATATGCCAG CGCCCAGGGA GGAGTCAGTA 4860 

OCCCTGGTAA ACCAAGOCTC CCACACTGAA ACCAAAACAT AAAAACCAAA COCAGACAAC 4920 

CAAAACATAT TCAAAGCAGC AGOCACCGCC ATCCGGTCCC CTGCCTGGAA CTGGCTGTGA 4980 
AGACCCAGGA GAGACAGAGA TGOGAACCAC C 



SEQ ID NO:22 OBH2 Protein sequence:
Protein Accession #: AAB4661B

1 11 21 31 41 51

MALRGFCSAD GSDPLWDHNV TWHTSNPDFT KCFQNTVLVW VPCPYLWACF EFYFLYLSBH 60 

DHGYIOMTHi NKTKTALGFL LWIVCWADLF YSFWERSRGI FIAPVFLVSP TL LGIT TUA 120 

TPLIQLERRK GVQSSGIHLT FWLVALVCAL AILRSKIMTA ZjKEDAQVDLF RDITFYVYFS 180 

LLLIQLVLSC PSCRSPLPSB TIHDPNPCPE SSASFLSRIT FWWITGLIVR GYRQPLEG3D 240 

LWSLNKEDTS EQWPVLVKN WKKBCAKTRK QPVKWYSSK DFAQFKESSK VDANEEVEAL 300 

IVKSPQKEWN PSLPKVLYKT FGPYFLMSFF FKAIHDLBHF SGPQILKLLI KPVNDTKAPD 360 

WQGYPYTVLI* FVFACLQTLV LHQYFHICFV SGMRIKTAVI GAVYRKALVI TNSABKSSTV 420 

GEIVNUEVD AQSFHDLATY IKKIWSAPLQ VILALYLLWL HIjGPSVLAGV AVKVLHVFVN 480 

AVMAHKTKOT QVAHKKSRDN RIKLHNEILN GIKVLKLYAW ELAFKDKVLA IRQEELKVLK 540 

KSAXLSAVGT FTWVCTPFLV ALCTFAVYVT IDENKILDAQ TAFVSLALPK ILRPPLNILP 600 

MVISSIVQAS VSLKRLRIPL SHEELEPDSI ERRPVKDGGG TOSITVHNAT FTMARSDPPT 660 

tHGITPSIPE GALVAWGQV GCGKSSLLSA LLREHDKVEG HVAIKGSVAY VPQQAWIQND 720 

SLRENIUGC QLEEPYYRSV IQACALLPDI/ EILPSGDRTE IGEKGVNLSQ GQKQRVSLAR 780 

AVYSNADIYL FDDPLSAVDA HVGKHIFENV IGPKGMLKNK TRILVTHEKS YLPQVDVIIV 840 

HSGGKISEKG SYQELLARDG APAEFLRTYA STBQEQDAEB NGVTGVSGPG KBAKQMENGtt 900 

LVTDSAGKQL QRQLSSSSSY SGDISRHHNS TAELQKAEAK KEBTWKLHEA DKAQTGQVKL 960 

SVYV7DYHKAI GLFISPLSIP LFMCNHVSAL ASNYWLSLMT DDPIVMGTQ8 HTKVRLSVYG 1020 

ALGISOG1AV FGYSHAVSIG GIIASRCLH7 DLLHSILRSP MSFPERTPSG HLVNRPSKEL 1080 

DTVDSMIPEV IKMFHGSLFN VIGACIVILL ATPIAA1IIP PLGLIYFFVQ RFYVASSRQL 1140 

KBLESVSRSP VYSHFNETLL GVSVIRAFEE QERFIHQSDL KVDEHQKAYY PSIVANRWIA 1200 

VELECVGNCI VLFAALFAVI SRHSLSAGLV GLSVSYSLQV TTYLNMLVRM SSEMETHIVA 1260 

VERIJCEYSET EKEAPKQIQE TAPPSSWPOV GBVEFBHYCL RYREDLDFVL RHIHVTIDGG 1320 

EKVGIVGRTG AGKSSLTLGL FHINESAEGE IIIDGIHIAK IGLHDLRFKI T1IPQDPVLF 1380 

SGSLRHNLDP FSQYSDEEVW TSLELARLKD FVSALPDKLD BECAEGGEHL SVGQRQLVCL 1440 

ASALUtKTKI tVLDEATAAV DLETDDLIQS TIRTQFEOCT VLTIAHRLNT 1MDVTK7IVL 1500 

DXGEIQBYGA PSSLLQQRGI. FYSHAKDAGL V 



SEQ ID NO:23 PAA2 DNA SEQUENCE

Nucleic Acid Accession #: NM_013309

Coding sequence: 1-1290 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

ATGGCCGGCT CTGGCGCGTG GAAGCGCCTC AAATCTATGC TAAGGAAGGA TCATGCGCCG 60
CTGTTTTTAA ATGACACCAG OQCCTTTGAC TTCTCGGATG AGGCGGGGGA CGAGGGGCTT 120
TCTOGGTTCA ACAAACTTGG AGTTGTGGTQ GCCGATGACG GTTCCGAAGC CCCGQAAAGG 180
CCTGTTAACG GGGCGCACCC GACCCTCCAG GCCGACGATG ATTCCTTACT GGACCAAGAC 240
TTACCTTTGA CCAACAGTCA GCTQAGTTTQ AAGGTQGACT CCTGTQACAA CTGCAGCAAA 300
CAGAGAGAGA TACTGAAGCA GAGAAAGGTQ AAAGCCAGGT TGACCATTGC TGCCGTTCTG 360
TACTTGCTTT TCATGMTGG AGAACTTGTA GGTGGATACA TTGCAAATAG CCTAGCAATC 420
ATGACAGATO CACTTCATAT GTTAACT6AC CXAAGOGCCA TCATACTCAC CCTGCTTGCT 480
TTGTGGCTAT CATCAAAAIC ACCAACCAAA AGATTCACCT TTGGATTTCA TCGCTTAGAG 540
GTTTTGTCAG CTATGATTAG TGTGCTGTTG GTGTATATAC TTATGGGATT CCTCTTATAT 600
CAAGCTGTGC AAAGAACTAT CCATATGAAC TATGAAATAA ATGGAGATAT AATGCTCATC 660
ACCGCAGCTG 3TGGAGTTGC AGTTAA9GTA ATAATGGGGT TTCTGTT6AA CCAGTCTOGT 720
CACCGTCACT CGCATTCOCA CTOCCTGCCT TCAAATTCCC CTACCAGAQG TTCTGGGTCT 780
GAACGTAACC AXGGGCAGGA TAGCCTGGCA GTGAGAGCTG CATTTGTACA TGCTTTGGGA 840
GATITGGTAC AGAGTGTTGG TGTGCTAA1A GCTGCATACA TCATACGATT CAAGCCAQAA 900
TACAAQAXTG CTGATCCCAT CTGTACATAC GTATTTTCAT TACTTGTGGC TTTTACAACA 960
TTTCGAATCA TATGGGATAC AGTAGTTATA ATACTAGAAG GTGTGCCAAG CCATTTGAAT 1020
GTAJ3ACTATA TCAAAGAAGC CTTGATGAAA ATAGAAGATG TATATTCAGT CGAAGATTTA 1080
AATATCTGGT CTCTCACTTC AGGAAAATCT ACTGCCATAQ TTCACATACA GCTAATTCCT 1140
GGAAGTTCAT CTAAATGGGA GGAAGTACA6 TCCAAAGCAA ACCATTTATT ATTGAACACA 1200
TTTGGCATGT ATAGATGTAC TATTCAGCTT CAGAJ3TTACA GGCAAGAAGT GGACAGAACT 1260
TGTGCAAATT GTCAGAGTTC TAGTCCCTGA



SEQ ID NO:24 PAA2 Protein sequence:
Protein Accession #: NP_037441

1 11 21 31 41 51

I I I I I I

MAGSGAWKRL KSHTiBKTlPAP LFLNDTSAFP FSDEAGDEGL SRFNKLKVW ADDGSEAPEB 60
PVNGAHPTLQ ADDDSLLDQD LPLTNSQLSL KVDSCDNCSK QREILKQRKV KARLTIAAVL 120
YLLPHIGELV GGYIANSLAI KTBALHKLTD LSAIILTLLA LWLSSKSPTK RFTFGFHRLE 180
VLSAXISVLL VYILMGFLLY EAVQRTIHHN YBINGDIHLI TAAVGVAVNV mGFLLKQSG 240
HSHSHSHSLP SNSPTRQSGC ERNHGQDSLA VRAAFVHALG ELVQSVGVLI AAYIIRFKPE 300
YKIADPICTY VFSLLVAFTT FRIIWDTWI ILEGVPSHLN VDYIKEALKK IEDVYSVEDI. 360
MIWSLTSGKS TAIVHIQUP GSSSKWSEVQ SKANHLLLNT FGHYKCTIQL QSYRQEVDRT 420
CANCQSSSP



SEQ ID NO:25 PAA3 DNA SEQUENCE

Nucleic Acid Accession #: AB037765

Coding sequence: 375-2798 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

GCCGAGTCGG TGGCGGCTGC AGGCTGGGAG GGAGAAGTGC TACGGCTTTG CAGGTTGGCG 60

AAGTGGTTCC AGGCTAOCCG GCTAGTCTGG CACGGCCCCG TCTTCTGCCT CCTCCTCCGT 120

NNNNNNNNNN OGGGAACTGT TGGCCGCGCG GCCTCGGGAA CGGCCCAGGT OCCCGCOCGC 180

AGGTCOCGGG CAGATAACAT AGATCAICAG TAGAAAACTT CTTGAAGTTG TTCAAGAAAA 240

ATTTGAAAGT AGCAAAATAG AAAATAAAGA ATTAACAGCA CATACAGAGG ACAGCATGQA 300

AGTGTTGTCT TAGGAAACAG AACACAGCAG TGAAAAAACA GACAAAATCC GCTCAGATAC 360

AACT6CAGCT GATAATCTTT TCCGGCTTCA ATGTCTTTAG AGTTGGGATC TCTTTTGTCA 420

TAATGTGCAT TTTTTACATG CCAACAGTAA ACTCTTTACC AGAACTGAGT CCTCAGAAAT 480

ATTTTAGTAC ATTGCAACCA GGTCTTGAAG AACTGAATGA GGCTGTTAGA CCTCTGCAGG 540

ACTATGGAAT 1TCAGTTGCC AAGGTTAATT GTGTCAAAGA AGAAATATCA AGATACTGTG 600

GAAAAGAAAA GGATTTGATG AAAGCATATT TATTCAAGGG CAACATATTG CTCAGAGAAT 660

TCCCTACTGA CAOCTTGTTT GATGTGAATQ CCATTGTCGC OCATGTTCTC TTTGCTCTTC 720

TTTTTAGTGA AGTGAAATAT ATTACCAACC TGGAAGACCT TCAGAACATA GAAAATGCTC 780

TGAAAGGAAA AGCAAATATT AIATTCTCAT ATGTAAQAGC CATTGGAATA CCAGAGCACA 840

GAGCAGTCAT GGAAGCCGGT TTTGTGTATG GGACTACASA CCAATTTGTC TTAACCACAG 900

AAATTGCCCT TTTGGAAAGT ATTGGCTCTG AGGATGTGGA ATAIGCACAT CTCTACTTTT 960

TTCATTGTAA ACTAGTCTTG GACTTGAOCC AGCAATGTAG AAGAACACTA ATGGAACAGC 1020

CATTQACTAC ACTGAACATT CACCTGTTTA TTAACACAAT GAAAGCACCT CTGTTGACTG 1080

AAGTTGCTGA AGATCCTCAA CAAGTTTCAA CTGTCCATCT CCAACTSGGC TTACCACTGG 1140

TTTTTATTGT TAGCCAACAG GCTACTTATG AAGCTGATAG AAGAACTGCA GAATGGGTTO 1200

CTTGGCGTCT TCTGGGAAAA GCAGGAGTTC TACTCTTGTT AAGGGACTCT TTGGAAGTGA 1260

ACATTCCTCA AGATGCTAAT GTGGTCTTCA AAAGAGCAGA AGAGGGAGTT CCAG1GGAAT 1320

TTTTCGTATT ACATGATGTT GATTTAATAA TATCTCATGT GGAAAATAAT ATGCACATTG 1380

AGGAAATACA AGAAGATGAA GACAA1GACA TGGAAGGTCC AGATATAGAT GTTCAGGATG 1440

ATGAAGTGGC AGAAACTGTT TTCAGAGATA GGAAGAGAAA ATTACCTTTG GAACTTACAG 1500

TGGAACTAAC AGAASAAACA TTTAATGCAA CAGTGATGGC TTCTGACAGC ATAGTACTCT 1560

TCTAtGCTGG TTGGCAAGCA GTATCCATGG CATTTTTGCA ATCCTATATT GATGTGGCAG 1620 

TTAAACTGAA AGGCACATCT ACTATGCTTC TTACTAGAAT AAACTGTGCA GATTGGTCTG 1680 

ATGTATGTAC TAAGCAAAAT GTTACTGAAT TTCCTATCAT AAAGATGTAC AAGAAAGGCG 1740 

AGAAOOCAGT ATCTEATGCT GGAATGTTAG GAACCAAAOA TCTCCTAAAA TTTATCCAGC 1800 

TCAACAGGAT TTCATATCCA GTGAATATAA CATCGATCCA AGAAGCAGAA 6AATATTTAA 1860 

GTGGGGAATT ATATAAAGAC CTCATCTTGT ATTCTAGTCT GTCAGTATTG GGACTATTTA 1920 

GTCCAAOCAT GAAAACAGCA AAAGAAGATT TTAGTGAAGC AOGAAACTAC CTAAAAGGAT 1980 

ATGTTATCAC TGGAATTTAT TCTGAAGAAG ATGTTTTGCT ACTGTCAACC AAATATGCTG 2040 

CAAGTCT T CC AGCCCTGCT G CTTGCCAGAC ACACAGAAGG CAAAATAGAG AGCATCCCAC 2100 

TAGCTAGCAC ACATGCACAA GACATAGTTC AAATAATAAC AGATGCACTA CTGGAAATGT 2160 

TTCCGGAAAT CACTGTGGAA AATCTTCCCA GTTATTTCAG ACTTCAGAAA CCATEATTGA 2220 

TTTTGTTCAG TGATGGCACT QTAAATCCTC AATATAAAAA AGCAATATTG ACACTGGTAA 2280 

AGCAGAAATA CTTGGATTCA TTTACTCCAT GCTGGTTAAA TCTAAAGAAT ACTCCAGTGG 2340 

GGAGAGGAAT CTTGCGGGCA TATTTTGATC CTCTGCCTCC CCTTCCTCTT CTTGTTTTGQ 2400 

TGAATCTCCA TTCASGTGGC CAAOTATTTO CATTTCCTTC AGACCAGGCT ATAATTGAAG 2460 

AAAACCTTGT ATTGTGGCTG AAGAAATIAG AAGCAGGACT AGAAAATCAT ATCACAATTT 2520 

TACCTGCTCA AGAATGGAAA CCTCCTCTTC CAQCTTATOA TTTTCTAAGT ATQATAGATG 2580 

CCGCAACATC TCAACGTGGC ACTAGGAAAG TTCCCAAGTG TATGAAAGAA ACAGATGTGC 2640 

AGGAGAATGA TAAGGAACAA CATGAAGAXA AATCGGCAGT CAGAAAAGAA CGGATTGAAA 2700 

CTCTGAGAAT AAAGCATT6G AATAGAAGTA ATTGGTTTAA AGAAGCAGAA AA ATCATT TA 2760 

GACGTGATAA AGAGTTAGGA TGCTCAAAAG TGAACTAATT TTATAGGGCT GTGGTT TCCA 2820 

AAA TTTTTTT GGCATGATAG ACTXAATTTA TTTCCTTAAA GAATAATATT AAATCATTTC 2880 

AAGTTTGCAG ACTAGTGCCA TCCAATAGAA TTATAATATA AGTCACATAT TTTATTTAAA 2940 

ATTTTCTAGT AACTACATTA AACAAAGTAA AAGTGAGCAG GGCAAAATAA TTTTGAXATT 3000 

ACTTTTCACC CAGTAGTATA CCCAAAATAG CGAAATAT3U3 AAATTATTAA TGAGATAJTT 3060 

TACATCCTTT TTTGTACCAA GTCTTCTAAA TGCAGTACAT ATTTTAXACT TACTGCATTT 3120 

CTTACTTCCG AGTAGCCATA TTTCAAGTGT TCATTGCCAC ATGTGGCCTG TGACTACTGT 3180 
ATTGGACAGT TCAGTACTAG ACAAAAACTA GCATAATTAA CTTAGTTCIA GCCATSATTT 3240 

CTATTTGGAT TAAAATTAAA CTCTAATCAC AGTTAACTCC ACAGTGCATT CATGCAGCTG 3300 

ACAGTTATAT TTGTTTTATT GGAGTCATGA TATTAAAATC AGCGTTTGTC AACCTCAGGG 3360 

GATATTTAGC AATTGTCGGG AGACATTTTT GATGTCATGA CIAGGGCAGT TATTGACATT 3420 

TAGTGAGTAG AGGCCATGGA TCCTGCTAAA TAACCTGCAT TGGACAGCGC CCCACAACAA 3480 

AGAATTATCC TGCCCGAAAT GGTAGTCGTG CCAAGGCTGA GTAACCTTGT GTTAAAAGTA 3540 

ACCTGTGGCA GACTAGGTTT CCAGAATTTC CTGGTTCTGC TCACGTATCA TGTTTGAAAA 3600 

AATTTTGGCT ATTAAAGATA TGTATTAGAT GGTCTTATCC TGATTATTAC CTGGATACAA 3660 

CTTGATCTTT TCTAATATTT TCAGAAAGTG ATGGGATAAC CCTAGAAGAG GACTCAGAAT 3720 

GATATTTATA TTTTAAGTGA GTCTTAAAAC CTCCTCTTAT TTCTACAAGT TATATGGCTA 3780 

AATTTCAGAT TGAACAGGGA TTCAGCATTC TGCCATCTCC TCATGGAAAG AGAGGCTCCC 3840 

TCATCTGAAG CGTCTCTGAA ATCTACCCTT GCAAGCTTCA GACAAATCAG TTGATCTCCC 3900 

TGAGCCACAC GGCCTCATTC TGTGAGGGAO GGAAAGATTA GCCAAAGAGT TAATTTTCAT 3960 

TCCAAATCAC TTAGCTGTTA GACTGATCTG TTTGTAGCAG TTGTTTGTCT CATTTTTGCT 4020 

CTGTGCATTT TTTGAGACAT TTGTTGAGAA TATTCTATTT GOTGCTCTAC TGTATTTTTC 4080 

TTTTTAATAT CTACTTGATA TCTTGTTCTT TAAATTTTCT TCACATATGG TTTGCCTGAT 4140 

ACAACTGATT TTTATAACTG AAATTTAAGG AATCTAACAG CTAAAACTCA GTAAGTGCAT 4200 

MTATTTCCTT ATAACATAGA CCCGTTGCTA CTCTCAGCAC CCTCTCCTCA ATTTTTTTTC 4260 

CTGTAGCATG TGATGCCTGA TTAAACTCAT TTTCATTTGC TTTTATTTCT AATATGGGAA 4320 

CAATGAGAGT GAACTCTAAA TATAGGTTGT AGTAATAAAA CATCATCAGC CTAATTATTA 4380 

GAAAASGCTA ATTAAGTACC AGCACATAGA AACATGAAAT TGCTTAGTCA TTSTACCTTT 4440 

GTCAGCAATT TTGACAGTCA TTAATGTTTG TCATAATTTT AAATAAAGTG TCTGGGTTTC 4500 
AGAAXACCTT CAAAAAAAAA AAAAAA 

SEQ ID NO:26 PAA3 Protein sequence:
Protein Accession #: BAAS2582

1 11 21 31 41 51

I I I I I I

MFSGFtJVPHV GISFVJMCIF YMPTVNSLPB LSPQKYFSTL QPGLEBLNEA VRPLQDYGIS 60 

VAKVKCVKEE ISRYCGKEKD LMKAYLFKGN ILLREPPTDT LFDVNATVAH VLPALLPSHV 120 

KY1TKLEDLQ NIENALKGKA HIIPSYVHAI GIPEHRAVME AGFVYGTTYQ FVLTTEIALL 180 

ESIGSEDVEY AHLYFFHCKL VLDLTQQCRR TLHEQPLTTL NTHLPIKTMK APLLTEVAED 240 

PQQVSTVHLQ -LGLPLVPIVS QOATYEADBB TAEWVASfRLI* GKAGVLLLLR DSLEVNIPQD 300 

ANWFKRAEE GVPVEFLVLH DVDLIISHVE NKMHIEEIQE DEDNDHEGPD TDVQDDEVAE 360 

TVFRDRKRKL PLELTVELTE ETFNATVMAS DSIVLPYAGW QAVSHAFLQS YIUVAVKLKG 420 

TSTOLLTRIN CADWSDVCTK QNVTEFPIIK MYKKGEHPVS YAGHLGTKDL LKFIQLHRIS 480 

YPVNITSIQE AEEYLSGBLY KDtOLYSSVS VLGLFSPTMK TAKEDFSEAG NYLKGWITG 540 

IYSEEDVLLL STKYAASLPA LLLARHTGGK IESIPLASTH AQDIVQIITD ALLEHPPEIT 600 

VENLPSYPRL QKPLLILFSD GTVNPQYKKA ILTLVKQKYL DSFTPCWLNL KNTPVGRGIL 660 

RAYFDPLPPL PLLVLVNLHS GGQVFAFPSD QAIIEENI.VL WLKKLEAGLE NKITILPAQE 720 

WKPPLPAYDP LSMIDAATSQ RGTRKVPKCM KETDVQENDK EQHKDKSAVR KEPIETLRIK 780 
HWNRSNWFKE AEKSFRRDKE LGCSKVN 

SEQ ID NO:27 PAA5 DNA SEQUENCE

Nucleic Acid Accession #: NM_012449

Coding sequence: 65-1085 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

CCGAGACTCA CGGTCAAGCT AAGGCGAAGA GTGGGTGGCT GAAGCCATAC TATTTTATAQ 60 

AATT AATGG A AAGCAGAAAA GACATCACAA ACCAAGAAGA ACTTTGGAAA ATGAAGCCTA 120 
GGAGAAATTT AGAAGAAGAC GATTATTTGC AIAAGGACAC GGGAGAGAGC AGCATGCTAA 180 

AAAGACCTGT GCTTTTGCAT TTGCACCAAA CAGCCCATGC TGATGAATTT GACTGCCCTT 240 

CAGAACTTCA GCACACACAG GAACTCTTTC CACAOTSGCA CTTGCCAATT AAAATAGCTQ 300 

CTATTATAGC ATCTCTGACT TTTCTTTACA CTCTICTGAG GGAAGTAATT CACCCTTEAG 360 

CAACTTCCCA TCAACAATAT TTTTATAAAA TTCCAATCCT GGTCATCAAC AAAGTCTTGC 420 

CAATGGTTTC CATCACTCTC TTOOCATTQG TTTACCTGCC AGGTGTGAIA GCAGCAATTG 480 

TCCAACTTCA TAATGGAACC AAGTATAAGA AGTTTGCACA TTGGTTQGAT AAGTGQATGT 540 

TAACAAGAAA GCAGTTTGGG CTTCTCACTT TCTTTTTTGC TGTACTGCAT GCAATTTATA 600 

QTCTGTCTTA CCCAATGAGG CGATCCTACA GATACAAGTT GCTAAACTGG GCATATCAAC 660 

AGGTCCAACA AAATAAAGAA GATCCCTGGA TTGAGCATGA TGTTTGGAQA ATGGAGATTT 720 

ATGTGTCTCT GGGAATTGTG CGATTGGCAA TACTGGCTCT GTTGGCTOT6 ACATCTAWC 780 

CATCTGTGAG TQACTCTFTG ACATGGA6AG AATTTCACTA TATTCAGAGC AA6CTAGQAA 840 

TTGTTTCCCT TCTACTGGGC ACAATACAOO CATTGATTTT TGCCTGGAAT AAGTGGATAG 900 

ATATAAAACA ATTTGTATGO TATACACCTC CAACTTTTAT GATAQCTGTT TTCCTTCCAA 960 

TTGTTGTCCT GATATTTAAA AGCATACTAT TCCTGCCATG CTTGAGGAAG AAQATACTCA 1020 

AOATTAGACA TGGTTGGGAA GACGTCACCA AAATEAACAA AACTCAOATA TGTTOCCAGT 1080 

T GTAGA ATTA CTGTTTACAC ACATTTTTGT TCAATATTGA TATATTTTAT CACCAACA1T 1140 
TCAAGTTTGT ATTTGTTAAT AAAATQATTA TTCAAGGAAA AAAAAAAAAA AAAAA 

SEQ ID NO:28 PAA5 Protein sequence

1 11 21 31 41 51 

I I I I I I 

MESEKDITNQ EEUWKKKPBR NLEEDDYLHK DTGETSHLKR PVLLHLHQTA HADEFDCPSE 60 
LQHTQELFPQ WHLPIKIAAI IASLTFLYTL LREVXHPLAT SHQQYFYKIP ILVIUKVLEW 120 

VSITLLALVJ LPGVIAAIVQ LH MgHHKKP PHWLDKKHLT RKQFGLLSPF FAVLHAIYSl* 180 

SYPMRRSVRV KLLNWAYQQV QQNKEDAWIE HDVWHMBIYV SLGIVGLAIL ALLAVTSIPS 240 
VSDSLTWREF HYIQSKLGIV SLLLGTIHAL IFAWNKWIDI KQTWYTPPT FMIAVFLPIV 300 
VUFKSILPL PCLEKKILKI RHGWEDVTKI NKTEICSQL 

SEQ ID NO:29 PAA7 DNA SEQUENCE

Nucleic Acid Accession #: NM_030774

Coding sequence: 1-963 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

ATGA GTTOCT GCAACTTCAC ACATGCCACC TTTGTGCTTA TTGGTATCCC AGGATTAGAG 60 

AAAGCCCATT TCTGGGTTGG CTTCCCCCTC CTTTCCATGT ATGTAGTGGC AATGTTTGGA 120 

AACTGCATCG TGGTCTTCAT CGTAAGGACG GAACGCAGOC TGCACGCTCC GATGTACCTC 180 

TTTCTCTGCA TGCTTGCAGC CATTGACCTG GCCTTATCCA CATCCACCAT GCCTAAGATC 240 

CTTGOCCTTT TCTGGTTTGA TTCCCGA6A6 ATTAGCTTTG AGGCCTGTCT TACCCAfiATG 300 

TTCTTTATTC ATGCCCTCTC AGGCATTGAA TCCACCATCC TGCTGGCCAT GGCCTTTGAC 360 

CGTTATOTGG CCATCTGCCA CCCACTGCGC CATGCTGCAQ TGCTCAACAA TACAGTAACA 420 

GCCCAGATTG GCATCGTGGC TGTGGTCCGC GGATCCCTCT TTTTTTTCCC ACTGCCTCTG 480 

CT6ATCAAGC GGCTGGCCTT CTGCCACTCC AATGTCCTCT CGCACTCCTA TTGTGTCCAC 540 

CAGGATGTAA TGAAGTTGGC CTATGCAGAC ACTTTGCOCA ATGTGGTEATA TGGTCTTACT 600 

GCCATTCTGC TGGTCATGGG CGTGGACGTA ATGTTCATCT CCTTGTCCPA TTTTCTGATA 660 

ATACQAACGG TTCTGCAACT GCCTTCCAAG TCAGAGCGGS CCAAGGCCTT TGGAACCTGT 720 

GTGTCACACA TTGGTGTGGT ACTCGCCTTC TATGTGCCAC TTATTGGCCT CTCAGTGGTA 780 

CACCGCTTTG GAAACAGCCT TCATCCCATT GTGCGTGTTG TCATGGGTGA CATCTACCTG 840 

CTGCTGCCTC CTGTCATCAA TCCCATCATC TATGGTGCCA AAACCAAACA GATCAGAACA 900 

CGGGTGCTGG CTATGTTCAA GATCAGCTGT GACAAGGACT TGCAGGCTGT GGGAGGCAAG 960 

TGACCCTTAA CACTACACTT CTCCTTATCT TTA3TGGCTT GATAAACATA ATTATTTCTA 1020 

ACACTAGCTT ATTTCCAGTT GCCCATAAGC ACATCAGTAC TTTTCTCTGG CTGGAATAGT 1080 

AAACTAAAGT ATGGTACATC TACCTAAAGG ACTATTATGT GGAATAATAC ATACTAATGA 1140 

AGTATTACAT GATTTAAAGA CTACAATAAA ACCAAACATG CTTATAACAT TAAGAAAAAC 1200 

AATAAAGATA GATGATTGAA ACCAAGTTGA AAAATAGCAT ATGCCTTGGA GGAAATGTGC 1260 

TCAAATTACT AATGATTTAG TGTTGTCCCT ACTTTCTCTC TCTTTTTTCT TTCTTTTTTT 1320 

TTTATTATGG TTAGCTGTCA CATACAACTT TTTTTTTTTT TCAGATGGGG TCTCGCTCTG 1380 

TCACCAGGCT GGAGTGCAGT GGOGCGATCT CGGCTCACTG CAACCTCCAC ATCCCATGTT 1440 

GAAJGTAATTC TTCTGCCTCA GCCTCCCGAG TAGCTGGGAC TAGAGGAACG TGCCAOCATG 1500 

ACTGGCTAAT TTTCTGTATT TTTTAGTAGA GACAGAGTTT CACCATGTTG GCCAGGATGG 1560 

TCTCGATCTC CTGACCTTGT GATCCACCCG CCTCAGCCTC CCAAAGTGTT GGGATTACAG 1620 

GTOTGAACCA CTGTGCCCGG OCTGTGTACA ACTTTTTAAA TAGGGAATAT GATAGCTTCG 1680 

CATGGTGGTG TGCACCTATA GCCCCCACTG CCTGGAAAGC TGAGGTGGGA GAATCGCTTG 1740 

AGTCCAGGAG TTTGAGGTTA CAGTGATCCA CGATCGTACC ACTACACTCC AGCCTGGGCA 1800 

ACAGAGCAAG ACCCTGTCTC AAAGCATAAA ATOGAATAAC ATATCAAATG AAACAGGGAA 1860 

AATGAAGCTG ACAATTTATG GAAGCCAGGG CTTGTCACAG TCTCTACTGT TATTATGCAT 1920 

TACCTGGGAA TTTATATAAG CCCTFAATAA TARTGCCAAT GAACATCTCA TGTGTGCTCA 1980 

CAATGTTCTG GCACTATTAT AAGTGCTTCA CAGGTTTTAT GTGTTCTTCG TAACTTTATG 2040 

GAGTAGGTAC CATTTGTGTC TCTTTATTAT AAGTGAGAGA AATQAAGTTT ATATTATCAA 2100 

GGGGACTAAA GTCACACGGC TTX3TGGGCAC TGTGCCAAGA TTTAAAATTA AATTTGATGG 2160 

TTGAATACAG TTACTTAATG ACCATCTTAT ATTGCTTOCT GTGTAACATC TGCCA TTTAT 2220 

TTCCTCAGCT GTACAAATCC TCliyiTWCt' CTCTGTTACA CACTAACATC AATGGCTTTG 2280 

TACTTGTGAT GAGAGATAAC CTTGCCCTAG TTGTGGGCAA CACATGCAGA ATAATCCTGT 2340 

TTTACAGCTG CCTTTCGTGA TCTTATTGCT TGCTTTTTTC CAGATTCAGG GAGAATGTTG 2400 

TTGTCTATTT GTCTCTTACA TCTCCTTGAT CATGTCTTCA TTTTTTAATG TGCTCTGTAC 2460 

CTGTCAAAAA TTTTGAATGT ACACCACATG CTATTGTCTG AACTTGAGTA TAAGA1AAAA 2520 
TAAAATTTTA TTTTAAATTT T 

SEQ ID NO:30 PAA7 PROTEIN SEQUENCE

Protein Accession #: NP_110401

1 11 21 31 41 51

I I I I I I 

MSSCNFTHAT PVLIGIPGLE KAHFWVGFPL LSMYWRMFG NCIWPIVRT ERSLHAPHIL 60 
PLCMLAAIDL ALSTSTHPKI LALPWPDSRE ISFBACLTQM PFIHALSAIB STILLAMAFD 120 
RYVAICUPLR HAAVLNNTVT AQIOIVAWR GSLFPPPLPL LIKRLAPCHS NVLSHSYCVH 180
QDVMOAYAD TLPNWYGLT AILLVMGVDV HFISLSYPLI IRTVLQLPSK SERAKAPGTC 240
VSHIGWLAP YVPLIGLSW HKPGNSLHPI VRWMGDIYIi LLPFVmPII YGARTKQIRT 300 
KVLAHFKISC DKDLQAVGGK 

SEQ ID NO:31 PAV6 DNA SEQUENCE

Nucleic Acid Accession #: XM_050837

Coding sequence: 1-1020 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

ATGAACTGGS laCTCCTGCT GTGGCTGCTG GTG C TGTQCQ CGCTGCTCCT GCTCTTGGTG 60

CAGCTCCTGC GC TTCCT G AO GGCTGACGGC GACCTOACGC TACTATGGGC CGAGTGGCAG 120 

GGACGACGCC CAGAATGGGA GCTGACTGAT ATGGTGGTGT GGGTGACTGG AGCCTCGAGT 180 

GQAATTGGTG AGQAOCTGGC TOACCAGTTG TCTAAACTAG GAGTTTCTCT TGTGCTGTCA 240 

GCCAGAAGAG TGCATGAGCT GGAAAGGGTG AAAAGAAGAT GCCTAGAGAA TGGCAATTTA 300

AAAGAAAAAG AZATACTTGT TTTGCCCCTT GAOCTGACCG ACACTGGTTC CCATGAAGCC 360

GCTACCAAAG CTGTTCTOCA GGAGTTTGGT AQAATCGACA TTCTGGTCAA CAATGCTGGA 420 

ATGTCOCAGC GTTCTCTGTG CAXGGATACC AGCTTGGATO TCTACAGAAA GCTAATAGAG 480

CTTAACTACT TAGGGACGGT GTCCTT G ACA AAATGTGTTC TGOCTCACAT QATCQAQAGQ 540 

AAGCAAGGAA AGATTOTTAC TGTGAATAGC ATCCTGGSTA TCATATCTGT ACCTCTTTCC 600 

AITGGATACT GTGCTAGCAA GCATGCTCTC CGGGGTTTTT TTAATGGCCT TCGAACASAA 660

CTTGCCACAT ACCCAGGTAT AATAGTTTCT AACATTTGCC CAGGACCTGT GCAATCAAAT 720 

ATTGTGGAGA ATTCCCTAGC TGGAGAAGTC ACAAAGACTA TAGGCAATAA TGGAGACCAG 780 

TCCCACAAGA TGACAACCAG TCGTTGTGTG CGGCTGATGT TAATCAGCAT GGCCAATGAT 840 

TTGAAAGAAG TTTGGATCTC AGAACAACCT TTCTTGTTAG TAACATATTT GTGGCA ATAC 900

ATGCCAACCT GGGCCTGGTG GATAACCAAC AAGATGGGGA AGAAAAGGAT TGAGAACTTT 960
AAGAGTGGTG TGGATGCAGA CTCTTCTTAT TTTAAAATCT TTAAGACAAA ACATGACTGA 

SEQ ID NO:32 PAV6 Protein sequence
Protein Accession #: XP_050837

1 11 21 31 41 51

I I I I I I

MUWELLLWLL VLCALLLLLV QLLRFLRADG DLTLLWAEWQ GHRPEWELTD MWWVTGASS 60 
GIGBEUXQL. SKLGV5LVLS ARRVHELERV KRRCLENGNL KEKDILVLPL DI/TDTGSHEA 120 
ATKAVLQEFG RIDILVKNGG MSQRSLCHDT SIJJVXKKLIE LNYLGTVSLT KCVLPHtHBR 180
KQGK1VTVNS XLGIISVPLS XGYCASKHAL KGFFNGLRTE IATYPGIIVS NICPGFVQSN 240 
IVEHSIAGEW TKTIGNNGDQ SHKHTTSRCV RLMLISMAND LKEVHXSEQP FLLVTYLWQY 300 
MPTWAWWrm KMOKKRIENF KSGVDADSSY FHFKTKHD 

SEQ ID NO:33 PBA6 DNA SEQUENCE

Nucleic Acid Accession #: NM_006853

Coding sequence: 26-874 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

AGGAATCTGC GCTCGGGTTC CGCAGATGCA GAGGTTGAGG TGGCTGCGGG ACTGGAAGTC 60 

ATCGGGCAGA GGTCTCACAG CAGCCAAGGA ACCXGGGGCC CGCTCCTCCC CCCTOCAGGC 120 

CATGAGGATT CTGCAGTTAA TCCTGCTTGC TCTGGCAACA GGGCTTGTAG GGGGAGAGAC 180 

CAGGATCATC AAGGGGTICG AGTGCAAGCC TCACTCCCAG CCCTGGCAGG CAGCCCTGTT 240

CGAGAAGACG CGGCTACTCT GTGGGGCGAC GCTCATCGCC CCCAGATGGC TCCTGACAGC 300

AGCCCACTGC CTCAAGCCCC GCTACATAGT TCACCTGGGG CAGCACAACC TCCAGAAGGA 360 

GGAGGGCTGT GAGCAGACCC GGACAGCCAC TGAGTOCTTC CCCCACCCCG GCTTCAACAA 420 

CAGCCTCCCC AACAAAGACC ACCGCAATGA CATCATGCTG GTGAAGATGQ CATCGCCAGT 480 

CTCCATCACC TGGGCTGTGC GACCCCTCAC CCTCTCCTCA CGCTGTGTCA CTGCTGGCAC 540

CAGCTGCCIC ATTTCCGGCT GGGGCAGCAC GTCCAGCCCC CACTTACGCC TGCCTCACAC 600

CTTGCGATGC GCCAACATCA CCATCATTGA GCACCAGAAG TGTGAGAACG CCTACCCCGG 660 

CAACATCACA GACACCATGG TGTGTGCCAG CGTGCAGGAA GGGGGCAAGG ACTCCTGCCA 720 

GGGTGACTCC GGGGGCCCTC TGGTCTGTAA CCAGTCTCTT CAAGGCATTA TCT0CTGGGG 780 

CCAGGATCCG TGTGCGATCA CCCGAAAGCC TGGTGTCTAC ACGAAAGTCT GCAAATATGT 840 

GGACTGGATC CAGGAGACGA TCAAGAACAA TTAGACTGGA CCCACCCACC ACAGCCCATC 900

ACCCTCCATT TCCACTTGGT GTTTGGTTCC TGTTCACTCT GTTAATAAGA AACCCTAAGC 960 

CAAGACCCTC TACGAACATT CTTTGGGCCT CCTGGACXAC AGGAGATGCT GTCACTTAAT 1020 

AATCAACCTG GGGTTCGAAA TCAGTGAGAC CTGGATTCAA ATTCTGCCTT GAAATATTGT 1080 

GACTCTGGGA ATGACAACAC CTGGTTTGTT CTCTGTTGTA TCCCCAGCCC CAAAGACAGC 1140 

TCCTGGCCAT ATATCAAGGT TTCAATAAAT ATTTGCTAAA TGAGTG

SEQ ID NO:34 PBA6 PROTEIN SEQUENCE

Protein Accession #: NP_006844

1 11 21 31 41 51

I I I I I I 

HRILQLILLA LATGLVGGET RI1KGFECKP HSQFWQAALF EKTRLLCGAT LIAPRWLLTA 60 

AHCLKPRYTV HLGQHNLQKE EGCEQTRTAT BSFFHPGEHH SLPHKDHEND IMLVKMASPV 120 

SITWAVRPLT LSSRCVTAGT SCLISGWQST SSPQLRUHT LRCANITXIE HQKCEMAYPG 180 

NITDTMVCAS VQEGGKDSCQ GDSGGPLVCN QSLQGIISWG QDPCAITRKP GVYTKVCKYV 240 
DWIQETMKKN 

SEQ ID NO:35 PBC1 DNA SEQUENCE

Nucleic Acid Accession #: NM_001775

Coding sequence: 70-972 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

CTAAAGCTCT CTTGCTGCCT AGCCTCCTGC CGGCCTCATC TTCOOCCAGC CAACCGCGCC 60 

TGGAGCCCTA TGGCCAACTG CGAGTTCAGC CCGGTGTCCG GGGACAAACC CTGCTGCCGG 120 

CTCTCTAGGA GAGCCCAACT CTGTCTTGGC GTCAGTATCC TGGTOCTGAT CCTCGTCQTG 180 

GTGCTOGOGG TGGTCGTCOC GAGGTGGCGC CAGAOGTGGA GCGGTCCGGG CACCACCAAG 240 

CUL1TO3UUU AGACCGTCCT GGCGCGATGC GTCAAGTACA CTGAAATTCA TCCTGAGATG 300 

AGACATGTAQ ACTGCCAAAG TGTATGGGAT GCTTTCAAGG GTGCATTTAT TTCAAAACAT 360 

CCTTGCAACA TTACTOAAQA AGACTATCAG CCACTAATQA AQTTGGOAAC TCAGACCGXA 420 

OCTTGCAACA AGATTCTTCT TTCGAGCAGA ATAAAAOATC TGGCOCATCA GTTCACACA6 480 

GTCCAGCGGG ACATGTTCAC OCTGGAGGAC ACGCTGCTAG GCTACCTTGC TGATGACCTC 540 

ACATGGTGTG GTGAATTCAA CACTtCCAAA ATAAACTATC AATCTTOCCC A6ACTG6A0A 600 

AAGGACTGCA GCAACAACCC TGTTTCAGTA TTCTCGAAAA CGGTTTCOCG CAGGTTTGCA 660 

GAAGCTGCCT GTGATGTGGT CCATGTGATG CTCAATGGAT CCCGCAGTAA AATCTTTGAC 720 

AAAAACAGCA CTTTTGGGAG TGTG0AA6TC CATAATTTGC AACCAQAGAA GGTTCAGACA 780 

CTAQAGGCCT GGGTGATACA TGGTGGAAGA GAAGATTCCA GAGACTTATG CCAGQATCCC 840 

ACCATAAAAG AGCTGGAATC GATTATAAGC AAAAGGAATA TTCAATTTTC CTGCAAGAAT 900 

AICTACAGAC CTGACAAGTT TCTTCAGTGT GTQAAAAATC CTGAGQATTC ATCTTGCACA 960 

TCTGAGATCT GAG CCACTCO CTGTGGTTGT TTTAGCTCCT TGACTCCTTG TGGTTTATGT 1020 

CATCATACAT GACTCAGCAT ACCTGCTGGT GCAGAGCTGA AGATTTTGGA GGSTCCTCCA 1080 

CAATAAGGTC AATGCCAGAG ACGGAAGCCT TTTTCCCCAA AGTCTTAAAA TAACTTATAT 1140 

CATCAGCATA CCTTTATTGT GATCTATCAA TAGTCAAGAA AAATTATTGT ATAAGATTAG 1200 
AATOAAAATT GTATGTTAAG TTACTTCCTT TAG 

SEQ ID NO:36 PBC1 Protein sequence
Protein Accession #: NP_001768

1 11 21 31 41 51 

I I I I I I 

UANCEFSPVS GDKPCCRLSR RAQLCLGVSI LVtltVWLA VWPBWRQTW SGPGTTKHFP 60 

ETVLARCVKY TEIHPEHHHV DCQSVWDAFK GAPISKHPCN ITEEDYQPLM KLGTCTVPCN 120 

KILLWSRIKD LAHQFTQVQR EMFTLEDTLL GYLADDLTWC GEFIITSKINY QSCPDWRKDC 180 

SNNPVSVFHK TVSRRFAEAA CDWHVMLNG SRSKIFDKNS TFGSVEVBNL QPEKVQTLEA 240 
WIHGGREDS ROLCQDFTZK ELESIISKRN IQFSCKNIYH FDKFLQCVKH PEDSSCTSEI 

SEQ ID NO:37 PBH1 DNA SEQUENCE

Nucleic Acid Accession #: XM_017718

Coding sequence: 1-3315 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGTCCTTTC GGGCAGCCAG GCTCAGCATG AGGAACAGAA GGAATGACAC TCTGGACAGC 60 

ACCCGGACCC TGTACTCCAG CGCGTCTCGG AGCACAGACT TGTCTTACAG TGAAAGCGAC 120 

TTGGTGAATT TTATTCAAGC AAATTTTAAG AAACXSAGAAT GTGTCTTCTT TACCAAAGAT 180 

TCCAAGGCCA CGGJAOAATGT GTGCAAGTGT GGCTATGCGC AGAGCCAGCA CATGGAAGGC 240 

ACCCAGATCA ACCAAAGTOA GAAATGGAAC TACAAGAAAC ACACCAAGGA ATTTCCTACC 300 

GACGCCTTTG GGGATATTCA GTTTGAGACA CTGGGGAAGA AAGGGAAGTA TATACGTCTG 360 

TCCTGCGACA CGGACGCGGA AATCCTTTAC GAGCTGCTGA CCCAGCACTQ GCACCTGARA 420 

ACACCCAACC TGGTCATTTC TSTGACCGGG GGCGCCAAGA ACTTCGCCCT GAAGCCGCGC 480 

ATGCGCAAGA TCTTCAGCCG GCTCATCTAC ATCGCGCAGT CCAAAGGTGC TTGGATTCTC 540 

ACGGGAGGCA OCCATTATGG CCTGATGAAG TACATCGGGG AGGTGGTGAG AGATAACACC 600 

ATCAGCAGGA GTTCAGAGGA GAATATTGTG GCCATTGGCA TAGCAGCTTG GGGCATGGTC 660 

TCCAACCGGG ACACCCTCAT CAGGAATTGC GATGCTGAGG GCTATTTTTT AGCCCAGTAC 720 

CTTATGGATG ACTTCACAAG AGATCCACTG TATATCCTGG ACAACAACCA CACACATTTG 780 

CTGCTCGTGG ACAATGGCTG TCATGGACAT CCCACTGTCG AAGCAAAGCT CCGGAATCAG 840 

CTAGAGAAGT ATATCTCTGA GCGCACTATT CAAGATTCCA ACTATGGTGG CAAGATCCCC 900 

ATTGTGTGTT TTGCCCAAGG AGOTGGAAAA GAGACTTTGA AAGCCATCAA TACCTCCATC 960 

AAAAATAAAA TTCCTTGTGT GGTGGTGGAA GGCTCGGGCC AGATCGCTGA TGTGATCGCT 1020 

AGCCTGGTGG AGGTGGAGGA TGCCCTGACA TCTTCTGCCG TCAAGGAGAA GCTGGTGCGC 1080 

TTTTTACCCC GCACGGTGTC CCGGCTGCCT GAGGAGGAGA CTGAGAGTTG GATCAAATGG 1140 

CTCAAAGAAA TTCTCGAATG TTCTCACCTA TTAACAGTTA TTAAAATGGA AGAAGCTGGG 1200 

GATGAAATTG TGAGCAATGC CATCTCCTAC GCTCTATACA AAGCCTTCAG CACCAGTGAG 1260 

CAAGACAAGG ATAACTGGAA TGGGCAGCTG AAGCTTCTGC TGGAGTGGAA CCAGCTGGAC 1320 

TTAGCCAATG ATGAGATTTT CACCAATGAC CGCCGATGGG AGTCTGCTGA CCTTCAAGAA 1380 

GTCATGTTTA CGGCTCTCAT AAAGGACAGA CCCAAGTTTG TCCGCCTCTT TCTGGAGAAT 1440 

GGCTTGAACC TACGGAAGTT TCTCACCCAT GATGTCCTCA CTGAACTCTT CTCCAACCAC 1500 

TTCAGCACGC TTGTGTACCG GAATCTGCAG A3CGCCAAGA ATTCCTATAA TGATGCCCTC 1560 
CTCACGTTTG TCTGGAAACT GGTTGCGAAC TTCCGAAGAG GCTTCCGGAA GGAAGACAGA 1620

AATGGCCGGG ACGAGATGGA CATAGAACTC CACGAOGTGT CTCCTATTAC TC6GCAOCCC 1680 

CTGCAAGCTC TCTTCMCTG GGCCATTCTT CAGAATAAOA AGGAACTCTC CAAAGTCATT 1740 

TGGGAGCAGA CCAGGGGCTG CACTCTGGCA GOOCT GGG AG CCAGCAAGCT TCTGAAGACT 1800 

CTGGCCAAAG TGAAGAACGA CATCAATGCT GCTQQGGAGT CCGAGOAGCT GGCTAATGAG 1860 

TACGAGACCC GGGCTGTTGA GCTOTTCACT GAGTGTTACA GCAGCGATGA AGACTTGGCA 1920 

GAACAGCTGC TGGTCTATTC CTGTGAAGCT TGGGGTGGAA GCAACTGTCT GGAGCTGGCG 1980 

GTGGAGGCCA CAGACCAGCA TTTCATCGCC CAOCCTGGCO TCCAGAATTT TCTTTCTAAG 2040 

CAATGCTATG GAGAGATTTC CCGAGACACC AAGAACTGGA AGATTATCCT C WM WU M U' 2100 

ATTATACCCT TGGTGGGCTG TGGCTTTGTA TCATTTAGGA AGAAACCTGT CGACAAGCAC 2160 

AACAAGCTGC TTTGGTACTA TGTGGCGTTC TTCACCTCCC OCTTCGTGOT CTTCTCCTGG 2220 

AATGTGGTCT TCTACATOGC CTTCCTCCTG CTGTTTGCCT ACGTGCTGCT CATGGATTTC 2280 

CATTCGGTGC CACACCCCCC CGAGCTGGTC CTGTACTCGC TGGTCTTTGT CCTCTTCTGT 2340 

GATGAAGTGA GACAGTGGTA CGTAAATGGG GTGAATTATT TTACTGACCT GTGGAATGTG 2400 

ATGGACACGC TGGGGCTTTT TTACTTCATA GCAG6AATTG TATTT0G6CT CCACTCTTCT 2460 

AAXAAAA6CT CTTTGTATTC TGGACQAGTC AT TTTC T G TC TGGACTACAT TATTTTCACT 2520 

CTAAGATTGA TCCACATTTT TACTGTAAGC AGAAACTTAO GACCCAAGAT TATAATGCTG 2580 

CAQAGOATGC TGATOGATGT GTTCTTCTTC CTGTTCCTCT TTGCGGTGTQ GATGGTGGCC 2640 

TTTGGCGTGG CCAGGCAAGG GATCCTTAGG CAGAATGAGC AGCGCTGGAG GTGGATATTC 2700 

CGTTCGGTCA TCTACGAGCC CTACCTGGOC ATGTTCGGCC AGGTGOOCAG TGACGTGGAT 2760 

GQTACCACGT ATGACTTTGC CCACTGCAOC TTCACTGGGA ATGAGTCCAA GCCACTGTGT 2820 

GTGGAGCTGG ATGAGCACAA CCTGCOCCGG TTCCCCGAGT GGATCACCAT CCCCCTGGTG 2880 

TGCATCTACA TGTTATCCAC CAACATCCTG CTGGTCAAOC TGCTGGTCGC CATGTTTGGC 2940 

TACACGGTGG GCRCCGTCCA GGAGAACAAT GACCAGGTCT GGAAGTTCCA GAGGTACTTC 3000 

CTGGTGCAGG AGTACTGCAG CCGCCTCAAT AICCCCTTOC CCTTCATCGT CTTCGCTTAC 3060 

TTCTACATGG TGGTGAAGAA GTGCTTCAAG TGTTGCTGCA AGGAGAAAAA CATGGAGTCT 3120 

TCTGTCTGCT GTTTCAAAAA TGAAGACAAT GAGACTCTGG CATGGSAGGG TCTCATGAAG 3180

GAAAACTACC TTGTCAAGAT CAACACAAAA GCCAACGACA OCTCAGAGGA AATGAGGCAX 3240

CGATTTAGAC AACTGGATAC AAAGCTTAAT GAICTCAAGG GTCTTCTGAA AGAGATTGCT 3300 
AATAAAATCA AATGA 

SEQ ID NO:38 PBH1 Protein sequence
Protein Accession #: XP_017718

1 11 21 31 41 51 

I I I I I I 

HSFRAARLSM RNRENDTLDS TRTLYSSASR STDLSYSKSO LVNFIQANFK KRECVFFTKD 60 

SKATENVCKC GYAQSQHMEG TQINQSEKHH YKKHTKEFPT DAFGDIQFET LGKKGKYIRI, 120 

SCDTDAHILY ELLTQHWHLK TPNLVISVTG GAKHFALXPR MKKIPSKLIY IAQSKGAM1L 180 

TGGTHYGLHK YIGEWHDNT ISRSSEENIV AIGIAAWGMV SKRDTLIRNC DAEGYFIiAQY 240 

LHDDFTRDPL YILDNNHTOL LLVDNGCHGH PTVEAKLRNQ LEKYISERTI QDSNYGGKIP 300 

IVCFAQGGQK ETLKAHJTSI KNKIPCVWE GSGQIADVIA SLVEVEDALT SSAVKEKLVR 360 

FLPRTVSRLP BBBTESH1KW LKEILECSHL LTVIKHEEAG DEIVSHAXSY ALYKAFSTSE 420 

QOKDHWNGQL KM.LEWNQLD LANDE1PTND RRWESADLQE VHFTALIKDR PKFVRLPLEN 480 

GLNLRKFLTH DVLTELPSNH FSTLVYRNLQ IAKNSYNDAL LTFVWKLVAN FRRGFRKEDR 540 

HGRDEMDIEL HDVSPITRHP LQALFXWAIL QHKKELSKVI HEQTRGCTLA ALGASKLLKT 600 

LAKVKNDUZA AGESEELANB YETRAVELFT ECYSSDEDLA EQLLVTSCKA HGGSNCLBLA 660 

VEATDQHPIA QPGVQUFLSK QWYGEISRDT KHWKIILCLP IIPLVGCGFV SPBKKPVDKH 720 

KKLLHYYVAF FTSPFWPSW NWFYIAFLL LFAYVLU1DF HSVFHPPEXiV LYSLVFVLFC 780 

DEVRQWYVNG VNYFTDLWNV HDTLGLFYPI AGXVFRLHSS NKSSLYSGKV IFCLDYIIFT 840 

LRLIHIPTVS HNLGPKIZHL QRHLXDVFFF LFLFAVWMVA FGVARQGILR QNEQEWRWIF 900 

RSVTYBFYLA HFGQVPSDVD GTTYOFAHCT FTGNESKPLC VELDEHNLPR FPEWTTIPLV 960 

CIYKLSmtL LVNIJ.VAKFG YTVGTVQENN DQVWKFQRYF LVQEYCSRLN IPFPFIVFAY 1020 

FYHWKKCFK CCCKEKHKES SVCCFKNEDN ETLAHEGVKK EHYLVKIOTK AHDTSEKKBH 1080 
RPRQLDTKLN DLKGLLKEIA HKIK 

SEQ ID NO:39 PBH3 DNA SEQUENCE

Nucleic Acid Accession #: XM_0U804

Coding sequence: 1-558 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGCCTCGCC TGTTCTTGTT CCACCTGCTA GAATTCTGTT TACTACTGAA CCAATTTTCC 60 

AGAGCAGTCG CGGCCAAATG GAAGGACQAT GTTATTAAAT TATSCGGCCG CGAATTAGTT 120 

CGCGOGCAGA TTGCCATTTG CGGCATGAGC ACCTGGAGCA AAAGGTCTCT GAGCCAGGAA 180 

GATGCTCCTC AGACACCTAG ACCAGTGGCA GAAATTGTAC CATCCTTCAT CAACAAAGAT 240 

ACAGAAACTA TAATTATCAT GTTGGAATTC ATTGCTAATT TGOCACCGGA GCTGAASGCA 300 

GGCCTATCTG AGAGGCAACC ATC ATTAC CA GAGCTACAGC AGTATGTACC TGCATTAAAG 360 

GATTCCAATC TTAGCTTTGA AGAATTTAAG AAACTTATTC GCAATAGGCA AAGTGAAGCC 420 

GCAGACAGCA ATCCTTCAGA ATTAAAATAC TTAGGCTTGG ATACTCATTC TCAAAAAAAG 480 

AGACGACCCT ACGTGGCACT GTTTGAGAAA TGTTGCCTAA TTGGTTGTAC CAAAAGGTCT 540 
CTTGCTAAAT ATTGCTGA 

SEQ ID NO:40 PBH3 PROTEIN SEQUENCE

Protein Accession #: NP_008842

1 11 21 31 41 51 

I I I I I I 

HPRLPLFHLL EFCLLLNQPS RAVAAKHKDD VIKLCGRBLV RAQIAICGKS THSKRSLSQE 60 
DAPQTPRPVA EIVPSFINKD TETIITHT.RP XANLPPELKA ALSERQPSLP BLQOYVPALK 120

DSNLSFEBFK KLIHHRQSEA ADSNPSELKY LGLUIBSQKK RRPYVALPEK CCUGCTKRS 180 
LAKYC 

SEQ ID NO:41 PBH5 DNA SEQUENCE

Nucleic Acid Accession #: NM_005845

Coding sequence: 1-3978 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

ATGCTGCCCG TGTACCAGGA GGTGAAGCCC AACCCGCTGC AGGAGGOGAA CCTCTGCTCA 60 

CGCGTGTTCT TCTGGTGG C T CAATCCCTTG TTTAAAATTG GCCATAAACG GAGATTAGAG 120 

GAAGATGATA TGTATTCAGT GCTGCCAGAA GACCGCTCAC AGCACCTTGG AGAGGAGTTG 180 

CAAGGGTTCT GGGAXAAAGA AGTTTTAAGA GCTGAGAATG ACGCACAGAA GCCTPCTTTA 240 

ACAAGAGCAA TCATAAAGTG TTACTGGAAA TCTTATTTAG TTTTGGGAAT TTTTACGTTA 300 

ATTGAGGAAA GTGCCAAAGT AATCCAGCCC ATATTTTTGG GAAAAATTAT TAATTATTTT 360 

GAAAATTATG ATCCCATGGA TTCTGTGGCT TTGAACACAG CGTACGCCTA TGCCACGGTG 420 

CTGACTTTTT GCACGCTCAT TTTGGCTATA CTGCATCACT TATATTTTTA TCACGTTCA6 480 

TGTGCTGGGA TGAGGTTACG AGTAGCCATO TGCCAIATGA TTTATCCOAA GGCACTTCGT 540 

CTTAGTAACA TGGCCATGGG GAAGACAACC ACASGCCAGA TAGTCAATCT GCTGTCC A AT 600 

GATGTGAACA AGTTTGATCA GGTGACAGTG TTCTTACACT TCCTGTGGGC AGGACCACTG 660 

CAGGOSAICG CAGTGACTGC CCTACTCTGG ATGGAGAIAQ GAATATCGTG CCTTGCTGGQ 720 

ATGGCAGTTC TAATCATTCT CCTGCCCTTG CAAAGCTGTT TTGGGAAGTT GTTCTCATCA 780 

CW3AGGASTA AAACTGCAAC TTTCACGQAT GCCAGGATCA GGACCATQAA TGAASTTATA 840 

ACTGGTATAA GGATAAXAAA AATGTACGQC 7GGGAAAAGT CATTTTCAAA TCTTATTACC 900 

AATTTGAGAA AGAAGGAGAT TTCCAAGATT CTOAGAAGTT CCTGCCTCAQ GGGGA3GAAS 960 

TTGGCTTCCT TTTTCAGTGC AAGCAAAATC ATCGTGTT TC TGACCTTCAC CACCTACGTG 1020 

CTCCTCGGCA GTGTGATCAC AGOCAGCCGC GTGTTCGTGG CAGTGAOGCT GTATGGGGC? 1080 

GTGCGGCTGA CGOTTACCCT CTTCTTCCCC TCAGCCAXTG AQAGGGTGTC AGAGGCAAXC 1140 

GTCAGCATCC GAAGAATCCA GA CCTTTTT O CTACTTGATG AGATATCACA GCGCAACCGT 1200 

CAGCTGCCGT CAGATGGTAA AAAGATGGTG CATGTGCAGG ATTTTACTGC TTTTTGGGAT 1260 

AAGGCATCAG AGAGCCCAAC TCTACAAGGC CTTTCCTTTA CTGTCAGACC TGGCGAATTG 1320 

TTAGCTGTGG TCGGCCCCGT GGGAGCAGGG AAGTCATCAC TGTTAAGTGC CGTGCTCGGG 1380 

GAATTGGCCC CAAGTCACGG GCTGGTCAGC GTGCATGGAA GAATTGOCTA TGTGTCTCAG 1440 

CAGCCCTGGG TGTTCTCGGG AACTCTGAGG AGTAATATTT TATTTGGGAA GAAATACGAA 1500 

AAGGAACGAT ATGAAAAAGT CATAAAGGCT TGTGCTCTGA AAAAGGATTT ACAGCTGTTG 1560 

GAGGATGGTG ATCTGACTGT GATAGGAGAT CGGGGAACCA CGCTGAGTGG AGGGCAGAAA 1620 

GCACGGGTAA ACCTTGCAAG AGCAGTGTAT CAAGATGCTG ACATCTATCT CCTGGACGAT 1680 

CCTCTCAGTG CAGTAGATGC GGAAGTTAGC AGACACTTGT TCGAACTGTG TATTTGTCAA 1740 

ATTTTGCATG AGAAGATCAC AATTTTAGTG ACTCATCAGT TGCAGTACCT CAAAGCTGCA 1800 

AGTCAGATTC TGATATTGAA AGATGGTAAA ATGGTGCAGA AGGGGACTTA CACTGAGTTC 1860 

CTAAAATCTG GTATAGATTT TGGCTCCCTT TTAAAGAAGQ ATAATGAGGA AAGTSAACAA 1920 

CCTCCAGTTC CAGGAACTCC CACACTAAGG AATCGTACCT TCTCAGAGTC TTCGGTTTGG 1980 

TCTCAACAAT CTTCTAGACC CTCCTTGAAA GATGGTGCTC TGGAGAGCCA AGATACAGAG 2040 

AATGTCCCAG TTACACTATC AGAGGAGAAC CGTTCTGAAG GAAAAGTTGG TTTTCAGGCC 2100 

TRTAAGAATT ACTTCAGAGC TGCTGCTCAC TGGATTGTCT TCATTTTCCT TATTCTCCTA 2160 

AACACTGCAG CTCAGGTTGC CTATGTGCTT CAAGATTGGT GGCTTTCATA CTGGGCAAAC 2220 

AAACAAAGTA TGCTAAATGT CACTGTAAAT GGAGGAGGAA ATGTAAGCGA GAAGCTAGAT 2280 

CTEAAC7GGT ACTTAGGAAT TTATTCAGGT TTAACTGTAG CTAOCGTTCT TTCTGGCATA 2340 

GCAAGATCTC TATTGGTATT CTACGTCCTT GTTAACTCTT CACAAACTTT GCACAACAAA 2400 

ATGTTTGAGT CAATTCTGAA AGCTCCGGTA TTATTCTTTG ATAGAAATCC AATAGGAAGA 2460 

ATTTTAAATC GTTTCTCCAA AGACATTGSA CACTTGGATG ATTTQCTGCC GCTGACGTTI 2520 

TTAGATTTCA TCCAGACATT GCTACAAGTG GTTGGTGTGG TCTCTGTGGC TCTGGCCGTG 2580 

ATTCCTTGGA TCGCAATACC CTTG G TTCCC CTTGGAATCA TTTICATTTT TCTTCGGCGA 2640 

TATTTTTTGG AAACGTCAAG AGATGTGAAG CGCCTGGAAT CTACAACTCQ GAGTOCAGTG 2700 

TTTTCCCACT TGTCATCTTC TCTCCAGGGG CTCTGGACCA TCCGGGCATA CAAAGCAGAA 2760 

GAGAGGTGTC AGGAACTGTT TGATGCACAC CAGGATTTAC ATTCAGAGGC TTGGTTCTTG 2820

TTTTTGACAA CGTCCCGCTG GTTCGCCGTC CGTCTGGATG CCATCTGTGC CATGTTTGTC 2880 

ATCATCGTTG CCTTTGGGTC CCTGATTCTG GCAAAAACTC TGGATGCCGG GCAGGTTGGT 2940 

TTGGCACTGT CCTATGCCCT CAOGCTCATG GGGATGTTTC AGTGGTGTGT TCGACAAAGT 3000 

GCTGAAGTTG AGAATATGAT GATCTCAGTA GAAAGGGTCA TTGAATACAC AGACCTTGAA 3060 

AAAGAAGCAC CTTGGGAATA TCAGAAACGC OCACCACCAG CCTGGCCCCA TGAAGGAGTG 3120 

ATAATCTTTG ACAATGTGAA CTTCATGTAC AGTCCAGGTG GGCCTCTGGT ACTGAAGCAT 3180 

CTGACAGCAC TCATTAAATC ACAAGAAAAG GTTGGCATTG TGGGAAGAAC CGGAGCTGGA 3240 

AAAAGTTCCC TCATCTCAGC CCTTTTTAGA TTGTCAGAAC CCGAAGGTAA AATTTGGATT 3300 

GATAAGATCT TGACAACTGA AATTGGACTT CACGATTTAA GGAAGAAAAT GTCAATCATA 3360 

OCTCAGGAAC CTGTTTTGTT CACTGGAACA ATGAGGAAAA ACCTGGATCC CTTTAATGAG 3420 

CACACGGATG AGGAACTGTG GAATGCCTTA CAAGAGGTAC AACTTAAAGA AAOCATTGAA 3480 

GATCTTCCTG GTAAAATGGA TACTGAATTA GCAGAATCAG GATCCAATTT TAGTQTTGGA 3540 

CAAAGACAAC TGGTGTGCCT TGCCAGGGCA ATTCTCAGGA AAAATCAGAT ATTGATTATT 3600 

GATGAAGCGA CGGCAAATGT GGATCCAAGA ACTGATGAGT TAATACAAAA AAAAATCOGG 3660 

GAGAAATTTG CCCACTGCAC CGTGCTAACC ATTGCACACA GATTGAACAC CATTATTGAC 3720 

AGCGACAAGA TAATGGTTTT AGATTCAGGA AGACTGAAAG AATATGATGA GCCGTATGTT 3780 

TTGCTGCAAA ATAAAGAGAG CCTATTTTAC AAGATGGTGC AACAACTGGG CAAGSCAGAA 3840 

GCCGCTGCCC TCACTGAAAC AGCAAAACAG GTATACTTCA AAAGAAATTA TCCACATATT 3900 

GGTCACACTG ACCACATGGT TACAAACACT TCCAATGGAC AGCCCTCGAC CTTAACTATT 3960 
TTCGAGACAG CACTGTGA 

SEQ ID NO:42 PBH5 PROTEIN SEQUENCE

Protein Accession #: NP_005836

1 11 21 31 41 51 

I I I I I I 

MLPVYQBVKP KFLQDANLCS RVPPWWLNPL FKIGHKRRLE BDDMYSVLPE DRSOHLGEEL 60 

QGFWDKEVLR AENHRQKPSL TRAIIKCYWK SYLVLGIPTL IEESAKVIQP IPLGKIIHYF 120 

EHYDPHDSVA LNTAYAYATV LTFCTLILAI LHHLYFYHVQ CAGHRLKVAM CHMIYRKALR 180 

LSQ4AHGKTT TGQIVNLLSN DVNKFDQVTV FLHFLWAGPL QAXAVTALLtf HEIGISCIAG 240 

HAVLIILLPL QSCFGKLFSS LRSKTATFTD ARIR1MNEVI TGIRIIKMYA WEKSFSNLIT 300 

NLHKKEISKI LRSSCLRGMN LASFFSASXI IVPVTPTTW LLGSVITASR VFVAVTLYGA 360

VRLTVTLFFP SAIEKVSEAI VSIRSIQTFL LLDBISQRNR QLPSDGKKMV HVQDFTAFWD 420 

KASETPTLQG LSPTVRPGBL LAWGPVGAG KSSLLSAVLG ELAPSEGLVS VHGRIAYVSQ 480 

QPWVFSGTLR SHILFGKKYE KERYHKVXKA CALKKDI/QLL KDGDLTVIGD RGTTLSGGQK 540 

AKVNLARAVY QDADIYLLDD PLSAVDAEVS RHLFBLCXCQ ILHEKITItV TSQLQYLKAA 600 

SQIUUC06K HVQKOTVTBP LKSGIOFGSL LKKDNEESEQ PPVPOTPTLR NRTPSESSVW 660 

SQQSSRPSLK DGALESQDTB KVPVTLSEEN RSEGKVCTQA YKNYPRAGAH WIVPIPLIIX 720 

HTAAQVAYVt, QDWWLSYWAN KQSMLNVTVN GOGNVTEKLD LNWYLGIYSG LTVATVLFGI 780 

ARSLLVFYVL VNSSQTXHNK KFESILKAFV LFFDBNPIGR ILNRFSKDIG HLDDLLPLTP 840 

LDPIQTLL0JV VGWSVAVAV IPWIAIPLVP LGIIPIFLHR YPLETSHDVK RU3STTRSPV 900 

PSBLSSSLQO LWTIRAYKAE BRCOBLFDAH QDLHSEAKPL FLTTSRHFAV RLDAICAMFV 960 

IIVAFGSLIL AKTLDAGQVG LALSYALTIii GMFQWCVRQS AEVKNMHISV BRVIEYTDLE 1020 

KEAPWEYQKR PPPAWPKEGV IIFIWVKFMY SPGGPLVLKK L/TALIKSQEK VQIVGRTQAG 1080 

KSSLISALFR LSEPBGKIWI DKILTTEIGL HDUiKKMSII PQBPVLFTGT MRKNLDPFNE 1140 

HTDEELKMAL QBVQLKETIB DLPGKHDTEL AESGSNKSVG QRQLVCLAEA ILRKNQILII 1200 

DEATANVDPR TDBLIQKKIR EKFAHCTVLT IAHRLNTIXD SDKIBVLDSO RLKEYDEPYV 1260 

LLQNKKSLFY KHVQQLGKAB AAALTETAKQ VYFKKNYPHI GHTDHMVTNT SNGQPSTLTI 1320 
FETAL

SEQ ID NO:43 PBQ7 DNA SEQUENCE

Nucleic Acid Accession #: NM_021233

Coding sequence: 34-1119 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATSSQGAAAG TGTCCTGCTG TGGCATGAAA TAAATCAAAC AGAAAATGAT GGCAAGACTG 60 

CTAAGAACAT CCTTTGCTTT GCTCTTCCTT GGCCTCTTTG GGGTGCTGGG GGCAGCAACA 120 

ATPICATGCA GAAATGAAGA AGGGAAAGCT GTGGACTGGT TTACTTTTTA TAAGTTACCT 180 

AAAAGACAAA ACAAGGAAAG TGGAGAGACT GGGTTAGAGT ACCTGTACCT AGACTCTACA 240 

ACTAGAAGCT GGAGGAAGAG TGAGCAACTA ATGAATGACA OCAAGAGTGT TTTGGGAAGG 300 

ACATTACAAC AGCTATATGA AGCATATGCC TCTAAGAGTA ACAACACAGC CTATCTAATA 360 

TACAATGATG GAGTCCCTAA ACCTGTGAAT TACAGTAGAA AGTATGGACA CACCAAAGGT 420 

TTACTGCTGT GGAACAGAGT TCAAGGGTTC TGGCTGATTC ATTCCATCCC TCAGTTTCCT 480 

CCAATTCCGG AAGAAGGCTA TGATTATCCA CCCACAGGGA GACGAAATGG ACAAAGTGGC 540 

ATCTGCATAA CTTTCAAGTA CAACCAGTAT GAGGCAAXAG ATTCTCAGCT CTTGGTCTGC 600 

AACCCCAACG TCTATAGCTG CTCCATCCCA GCCACCTTTC AOCAGGAGCT CATTCACATG 660 

CCCCAGCTGT GCACCAGGGC CAGCTCATCA GAGATTCCTG GCAGGCTCCT CACCACACTT 720 

CAGTCGGCCC AGGGACAAAA ATTCCTCCAT TTTGCAAAGT OGGATTCTTT TCTTGACGAG 780 

ATCTTTGCAG CCTGGATGGC TCAACGGCTG AAGACACACT TGTTAACAGA AACCTGGCAG 840 

CGAAAAAGAC AAGAGCTTCC TTCAAACTGC TCCCTTCCTT ACCATGTCTA CAAXATAAAA 900 

GCAATTAAAT TATCACGACA CTCTTATTTC AGTTCTTATC AAGATCACGC CAAGTGGTGT 960 

ATTTCCCAAA AGGGCACCAA AAATCGCTGG ACATGTATTG GAGACCTAAA TCGGAGTCCA 1020 

CACCAAGCCT TCAGAAGTGG AGGATICATT TGTACCCAGA ATTGGCAAAT TTACCAAGCA 1080 
TTTCAAGGAT TAGTATTATA CTATGAAAGC TGTAA GTAAA CTTGGTGAAA GGACACAGGT

SEQ ID NO:44 PBQ7 Protein sequence
Protein Accession #: NP_067058

1 11 21 31 41 51

I I I I I I

MMARLLRTSP ALLFLGLFGV LGAATISCRN EEGKAVDWFT PYKLPKRQNK ESGETGLEYL 60 

YLOSTTRSKR KSEQLMNDTK SVLGRTLQQI, YEAYASKSNN TAYLIYNDGV PKPVNYSRXY 120 

GHTKGLLLWN KVQGFWLIHS IPQPPPIPBE GYDYPPTGRR NGQSGIC1TF KYNQYBAIDS 180 

QLLVCNPNVY SCSIPATFHQ ELIHHPQLCT HASSSEIPGR LLTTLQSAQG QKFLBFAKSD 240 

SPUJDIFAAW MAQRLKTHLL TETWQRKRQE LPSNCSLPYH VYNXKAIKLS RHSYFSSYQD 300 
HAKWCISQKG TKNRWTCIGD LNRSPHQAFR SGGFICTQNW QIYQAFQGLV LYYESCK 

SEQ ID NO:45 PCQ8 DNA SEQUENCE

Nucleic Acid Accession #: XM_03O«3

Coding sequence: 83-1273 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

CGGTGCCCTG GGGTGGAATA TCCCCTACGA ATTTAACCAA GCGGACTTTA ATGCCACTGT 60 

GCAGTTCATC CAAAACCACT TGGATGACAT GGATGTCAAA AAGGGTGTCT CCTGGACCAC 120 

CATCCGCTAC ATGATAGGAG AGATTCAATA TGGAGGCAGA GTCACTCACG ACTATGATAA 180 

GAGATTGTTG AACACATTTG CTAAGOTTTG GTTCAGTGAA AAXATGTTTG GACCAGATTT 240 

CAGTTTTTAC CAAGGATACA ATATTCCAAA ATGCAGCACA GTGGATAACT ATCTTCAGTA 300 

TATCCAGAGT TTGCCTGCCT ATGACAGCCC TGAGGTGTTT GGGCTGCACC CCAATGCTGA 360 
CATCACCTAC CAGAGCAAGC TGGCCAAGGA CCTOCTGQAC ACCATCCTAG CCATCCAACC 420

CAAGGACACC TCTGGTOGAG GGGATGAGAC CCGGQAGGCG GTGGTGGCCC GGCTGGCTGA 480 

TGATATGCTQ GAOAAGCTGC CCCCAGACTA TGT CCC CTTT GAAGTAAAAG AGAGGCTGCA 540 

GAAGATGGGG GCATTCCAGC CTATGAACAT TTTCCTCAGG CAGOAAATAG ACAGAATGCA 600 

AAGGGTACTC AGCCTTGTCC GCAGCACCCT CACTGAGCTG AAACTTGCTA TTGATGGCAC 660 

CATCATCATG AGCGAAAATC TGCAAGATGC ATTGGATTGC ATGTTTGATG CTAGAATCCC 720 

TGCTTGGTGG AAAAAAGCTT CTTGGQTTTT TAGTACACTG GGT TTC T GG T TTACTGAACT 780 

TATAGAAAGA AACAGCCAGT TTACCTCGTQ GGTTTTCAAT GGCCGACCTC ACTGCTTTTG 840 

GATGACGGGT TTTTTTAACC QCCAGGGATT TTTAACTGCA ATGCGACAGG AAATAACTCG 900 

GGCCAACAA& GGCTGGGCTC TGGACAATAT GGT G CTTTGC AATQAAOTCA CCAAATGGAT 960

GAAGGACGAC ATTTCTACCC CTOOCACAGA GGGTGICTAT GTCTATGGCT TATMCTTGA 1020 

AGGTGCTGGC TGGGACAAGA GGAACATGAA ACTCATT6AA TCAAAGCCAA AAGTGCTCTT 1080 

TGAGTTGATG CCTGTCATAA GQATTTATGC AGAAAACAAT ACTTTAOGAO ATCCTCGGTT 1140 

TTACTCCTGT CCCATCTATA AGAAGCCAGT TCGAACGGAC TTGAACTACA TTGCCGCTGT 1200 

GGATCTCAGG ACAGCCCASA OCCCTGAACA CTGGGTOCTC CGTGGGG T TG CCCTTCTGTG 1260 

TGATGTCAAQ TAACATGTGG GQAGTGTOCC CACCCAATGC TTTOOAAAAT GCAAGATCTA 1320 

AATTATTGTA ACCTTTATTT CTGTATGACT GCTGGACAGT GTATGTTAGG TCGTTTATGC 1380 

AATTAATGAG CTGCATAGGT TTPCCCCACT CCTTAAT7GG ATGCTTATAT TTTACTTGTT 1440 

TCATCATTAG TGAOCAATGT CTGAGTTTST TOAAAATGTT ATTTAGTGAT ATAAAAGTAA 1500 

ATTTACAGCA TCCTAATGAA GTGTGGCOC T CAAATGCACA GTAGTATATT TTCTTCTTAC 1560 

TTCGCTCCGA AGACTGACTG TGATTATAAC AGCAAAXASA TTTGCATQTG GACAAAGATT 1620 

AGATGGCAAG ATAGAAAAAT AAGAACAGAT GTGAXAGCAA GAATTATAGT TGGCTTGAAA 1680 

AAATGTQATG ATCAGGAGAA AAAATAAAAA AAGGGTAGAA AXATTABACG GTGCGTAGGG 1740 

ACTTTCTATG GACTTTTATT AATTAGGAAA CATTATCAAA GGAACTTTTC ACGTATTTTT 1800 

CTTTAAATTC TGGTTAGATG TTATTAATAA TTCTTCATCT AACCTACTGA CTAGAAAAXA 1860 

TAGTCAGTAC TAAATTAGAA T J U T UUTTI A TAAACTTTTO GTTAGCTCTG GATCTGTATA 1920 

ACTGCATTTT TTTGGATAAA CAGTTTTTGG TAGGTGGATA GCGGGAGACA AGTGTGGGTC 1980 

CCTCTCACTO GGCTTCATTC TGTGGACCAG GATCATTATT TCATGCTCAT GATCATGAGA 2040 

GTTAGGACTG AGTGGCTCCT GTGACTCCCA CCATCTTAGA TGATACTGTT TTCTTGTGAG 2100 

TIC T flCTl'l' TGGTGTGGAT TAGTATATCA GTTGATTTGT GTGAATTGTG GTGAAACAAT 2160 

CATTTCATTT TGAAAAGCAA GTAATGAAAA TGTCAGCATC ATAGGAATTA ATAAAATGTT 2220 
TTTACTAAAA AAAAAAAAAA AAA 

SEQ ID NO:46 PCQ8 Protein sequence 
Protein Accession #: BAB15543

1 11 21 31 41 51 

I I I I I I 

MDVKKGVSWT TIRYMIGBIQ YGGRVTDDYD KRLLNTFAKV WPSENMFGPD PSFYQGVNIP 60 

KCSTVUNYLQ YIQSLPAYDS PEVFGLHPNA DITYQSKLAK DVLOTILGIQ PKDTSGGGDE 120 

TREAWARLA DDMLEKLPPD WPPEVKERL QKKGPFQPKN IPLRQEIDRM QRVLSLVRST 180 

LTEUCLAIDG TIIMSENLQD ALDCMFDARI PAKWKKASWV FSTLGPWFTB LIEHNSQPTS 240 

WVFHGRPHCF WMTGFFMPQG FLTAMRQEIT RANKGWALDN MVLCNEVTKW HKDDISTPPT 300 

EGVYVYGLYL EGAGWDKHNM KLIESKPKVL PELMFVIRIY AENNTLRDPR PYSCPIttCXP 360 
VRTDLNYIAA VDLRTAQTPE HWVLRGVALL CDVK 



SEQ ID NO:47 P0G5 DNA SEQUENCE

Nucleic Acid Accession #: AB033036

Coding sequence: 68-3349 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

GGAGCAGCCT ACAACTTCAC AACCAGAAAC CACTACCCCT CAGGGGTTGC TTTCAGATAA 60 

AGATGACATG GGAAGGAGAA ATGCTGGCAT AGATTTCGGA TCCAGAAAAG CATCAGCAGC 120 

ACAGCCCATA CCTGAAAACA TGGACAATTC CATGGTTAGT GATCCACAAC CATACCATGA 180

AGATGCAGCT 1CTGGAGCTG AGAAGACAGA AGCCAGAGCT TCTCTCTCAC TGATGGTGGA 240 

AAGCCTTTCT ACAACCCAAG AGGAGGCCAT TCTCTCAGTA GCAGCAGAGG CTCAGGTGTT 300 

TATGAATCCT TCTCATATCC AGTTAGAAGA TCAAGAAGCT TTCAGCTTTG ATTTACAAAA 360 

GGCCCAATCC AAAATGGAGT CAGCCCAGGA TGTTCAAACT ATCTGCAAAQ AAAAGCCTTC 420 

TGGAAATGTT CACCAGACCT TTACAGCAAG TGTTTTGGGT ATGACAAGTA CTACAGCCAA 480 

AGGAGATGTT TATGCCAAGA CTCTGCCT C C CAGAAGCCTT TTTCAGTOCT CAAGGAAGCC 540 

TGATGCTGAA GAAGTCTCCT CAGATTCAGA GAATATTCCT GAGGAGGGGG ATGGTTCTGA 600 

AGAACTGGCT CATGGTCACT CTTCCCAGTC CTTGGGGAAG TTTGAAGATG AACAAGAAGT 660 

CTTCTCAGAA TCAAAAAGTT TTGTTGAGGA CTTGAGCAGC TCTGAGGAGG AGCTGGACCT 720 

CAGATGOCTC TCCCAGGCTT TAGAGGAGCC TGAAGATGCA GAAGTCTTCA CAGAATCAAG 780 

CAGTTATGTT GAAAAGTACA ACACTTCTGA TGATTGCAGC AGCTCAGAGG AAGACCTGCC 840 

TCTCAGACAC CCTGCTCAGG CCTTGGGAAA GCCCAAAAAC CAACAAGAAG TCTCCTCTGC 900 

TTCAAATAAT ACTCCTGAAG AGCAGAATGA TTTTATGCAG CAGCTGCCTT CCAGATGCCC 960 

TTCTCAGCCC ATTATGAATC CTACTGTPCA GCAACAAGTC COCACCAGTT CAGTGGGCAC 1020 

TTCTATAAAA CAGAGCGATT CCGTGGAGCC AATCCCTCCA AGACACCCTT TCCAGCCATG 1080 

GGTGAACCCT AAAGTGGAGC AAGAAGTTTC CTCATCTCCA AAGAGCATGG CTGTTGAAGA 1140 

GAGCATTTCT ATCAAGCCTC TGCCTCCTAA ACTTCTTTGC CAGCCCTTGA TGAATCCTAA 1200 

AGTTCAACAA AACATGTTCT CAGGTTCAGA GGACATTGCT GTTGAGAGAG TCATTTCTGT 1260 

GGAGCCACTA CTCCCCAGAT ATTCTGCTCA GTCCTTGACA GATCCTCAAA TCOGGCAAAT 1320 

CTCAGAAAGC ACAGCTGTTG AGGAAGGCAC TTATGTGGAA CCGCTGCCTC CCAGATGCCT 1380 

TTCCCAGCCC TCGGAGAGGC CTAAGTTCCT GGACTCAATG AGTACTTCTG CAGAATGGAG 1440 

CAGTCCTGTG GCACCAACAC CTTCCAAATA CS»CTTCCCCG CCATGGGTGA CCCCTAAATT 1500 

TGAGGAACTG TATCAACTCT CTGCACATCC AGAAAGCACT ACTGTTGAAG AGGACATTTC 1560 

TAAGGAGCAG CTGCTTCCCA GACATCTFTC CCAGTTGACT GTGGGAAATA AAGTCCAGCA 1620 

ACTGTCCTCA AATTTOGAGC GGGCTGCTAT TGAGGCAGAC ATTTCTGGGA GTCCATTGCC 1680

TCOCCAATAT GCTAGCCAGT TCTTAAAGAG GTCTAAAGTT CAGGAAATGA CCTCACGACT 1740

AGAGAAAATG GCTGTTGAAG GCACTTCTAA CAAATCACOG ATTCCCAGGC GTOOGACOCA 1800 

GTCATTCGTG AAATTTATGG CACAQCAAAT CTTTTCAGAO AGCTCTGCTC TTAAOAGGGG 1860 

CAGTGATGTG GCACCTCTGC CTOCCAATCT TCCTTCCAAA TCTTTATCAA A6CCTGAAGT 1920 

CMGCACCAA GTTTTCTCAO ATTCAGGGAG TGCTAATCCT AAGGGAGGCA TTTCTTCAAA 1980 

GATGCTACCT ATGAAGCACC CTTTACAGTC CTTGGGGAGG CCTGAAGACC CACAGAAAGT 2040 

TTTCTCTTAT TCAOASAGAG CTCCTGGGAA GTGCAGCAGT TTTAAAGAGC AGCT6TCTCC 2100 

CAGGCAGCTT TCCCAGGCCT TGAGGAAACC TGAGTATGAQ CAAAAAGTCT aMCr CW C 2160 

TQCCAGTTCT CCTAAAGAGT GGAGGAATTC TAAAAAGCAS CTGCCTCCCA AACATTCTTC 2220 

CCAAGCCTCA GATAGOTCTA AATTCCAGCC ACACATGTCA TCAAAGGGCC CAGTGAATGT 2280 

ACCTGTAAAS CAGAGCAGCG GTCAGAAGCA CCTGCCTTCA AGTAGTCCTT TCCAGCAACA 2340 

GGTTCATTCA AGTTCTGTGA ATGCTGCTGC TAGGOGAICT GTTTTTGAGA GCAATTCTGA 2400 

CAATTOOTTC CTAGGAAGAG ATGAAGCTTT 7GCAATCAAA AOCAAGAAAT TCAGCCAAGG 2460 

TTCCAAAAAC CCCATAAAGA GCATTOCAGC CCCTGCTACC AAACCTGGGA AGTTCACCAT 2520 

TGCTCCTGTC AGGCAAACAT CCACTTCTGG GGGCATTTAC TCTAAGAAAG AAGATCTTQA 2580 

GAGTGGTGAT GGTAATAAXA ACCAGCATGC AAACCTATCC AATCAGGATG ATGTTGAAAA 2640 

GdTTTTGGA GTTCGACTGA AAAGAQCCOC TCCCTCGCAG AAGTATAAGA GTCAGAAACA 2700 

AGATAACTTC AOCCAGCTTG CTTCAGTGCC CICGGGCCCA ATTTCATCCT CTOTAGGCAG 2760 

GGQACATAAA ATCAGAAGCA CTTCCCAGGO GCTCCTGOAT GCTGCAGGGA ACCTCAOCAA 2820 

AAXA1CTTAC GTTGCAGATA AGCAACAGAG CAGGCCCAAA TCTGAAAGCA TGGCCAAGAA 2880 

GCAACCTGCT TGCAAGACCC CAGGAAAGCC TGCTGGTCAA CAGTCAGATT ATGCTGTCTC 2940 

AGAGOCG C TT TGGATAACTA TGGCAAAGCA OAAGCAGAAG AGTTTCAAGG CCCACATTTC 3000 

TGTGAAAGAG CTGAAAACTA AGAGCAATGC 9GGAGCCGAS GCTGAGACTA AGGAGCCTAA 3060 

ATATGAGGGA GCTGGCTCTG CAAATGAAAA CCAACCTAAA AAGATGTTCA CTTCCAGTGT 3120 

OCAXAAACAG GAGAAGACAG CACAGATGAA GCCACCTAAG CCTACAAAAT CAGTTGGATT 3180 

TGAAGCTCAG AAGATACTGC AACTTOCT G C CATGGAAAAA GAAACCAAAC GASCCTTCAAC 3240 

TCTCCCAGCC AAGTTOCAGA ACCCAGTTGA GCCAATTGAG CCTGTCTGGT TCTCACTGGC 3300 

CAGQAAGAAA GCCAAAGCAT GOAGCCACAT GGCAGAAATC ACGCAATAAA GAGCTCTTGT 3360 

GTGGAGCATC AGCATTTATT TTATTTAGTT fTTTTTTTTTT TTTTTTTTTT GAGACAGAGT 3420 

CTCGCTCTCT TACCCAGATT GGAGTGCAGT GGCGCGATCT CCGCTCACTG CAAGCTCCGC 3480 

CTOCCGGGTT CACGCCACTC TCCCGCCTCA GTCTCCCGAC TAGCTGGGAC TACAGGCGCC 3540 

CGCCATCACG CCCGGCTAAT TTTGTTTTCG IATTTTTAGT AGAGACGGGG TTTCACCATG 3600 

TTGGCCAGGA TGGTCTTGAT CTCCTGACCT CGTGATCCGC CCGCCTCAGC CTCCCAAAAjG 3660 

CTGGGATTAC AGGOGTGAGC CACCOCGCCC GGCCAAGCAT CAGCGTTTTA AA30ATAATT 3720 

GCTAATAGCT GTATTAATTC TATGTAGTGA TCTTTTTACT GTGACCACTT GTATTAAGCA 3780 

AAATAAGTAT TAAGCAAACT AAGAATTTAT TAAGCAAAAT AAGAATTTAT TAAGCAAAAT 3840 

AGCCTTAGAA ATGCAAATTA AAACATAATT ATTTGAATGA AATAAATGCC ATGAATGCTT 3900 

AACCTTCCAC GTAGTCACTG CCAGCACCCA GAAACCCAGC ATTTCCTCTA TTAAAACTAT 3960 

CGAAAACATT TGCACTGCTG TAAAATTGCA AAATCTTTAA CTTTGGACAA TGTGCTTTAG 4020 

AAGGGAGAAA GCAAAAACAT TTTGTTGGAG CAACIAGAAA ATTGTCATTT CCCTCAAOCA 4080 

AATAAAGTAA TTCTAATGGA AACATTCAGA TGATTTGACC TAAAGATTGG CCTTTAGGTT 4140 

TTATGAGCCT AGAXAGATGC CGCAATTATT TGGTTGTTGC TCTAAGCTTT GCAAGGQATC 4200 

CTAAAAGAGG CGGTGGAAGT GAAAATTCTG GGTCTCCAAG AAAATTTCTG CACAGCCAGT 4260 

TCTCCAATCA GCCTATCACC CCTTGAAACA TCTTCCCTGT GTCCCTGGGG GCCCCTGATG 4320 

CmtmYl 1 GGGTGATAGT AACATGCAGA GCACTTACAC AAAGCTCCCT CTTTGGACAT 4380 

ACOOCACGTC GAOCTGTCAC AGG0CTGGCT GTAGCGAGCA CCTCCCTATG ACGCAGAATG 4440 

CTTCTTGGGA AMATCTTAC TCCTCTGGAG GGTTAGTCCA TCAATGTTTT GCTTCTTGTC 4500 

CCAAIACTAC TGTGACCCTC TCTGATCGCA CAGAAATCAC TGCCTATCAC ATATATCCTG 4560 

TTAAGCACTG AAGACCCTAT TGAAATTAGA GTTCTACAGA TGCCAAAAGC TGTACTTTCC 4620 

ATCAGGCAGA TGGCAAGCTT ACTGCCTTGA TGCACATCTG GAGCCACTGG AGCTCCTTCC 4680 

TCTCTGGTTC CAGCATTAAG GTGGAGAACT CCATGTAGCT TCTTGTCCTT TCCCCTCAGC 4740 

TQTCTTTGCT TCACAAGGTT TTAGCCCAAA GCAAGAGTGC AATCCCAAAG CCACA6AGAA 4800 

ATGAACTTTC CGCTACCTGG AAGCTTTAAC TGAGTAAATC AGCTTTTCCC CTCTCATTCC 4860 

TAGAGGCACA CACCTCAAAA GTXACTAGGC TGGASAGACC CTACCTTCCA GTGACCCACT 4920 

CATOCCCCAG CCACGGASAA GAGGGAAGAC CAAAAAGGGA GAGTGAGAAA GAGGATGAGA 4980 

GGGATGGTCA GCTGTGAGGG GAGGGGGCAA GTGGCCCAGC AAATGTTGAT GCCTCCCTTC 5040 

OCATCTTGCC ACACGGTCTT TTTCTTTT GT AGCACAGOCT CCAXTAATAA CTCCTCGGCT 5100 

GAGGATOAAG ATGTAGGCAC CTT?ACCCCC AGAGCCAGTT CCTTAATTGG CTGGCTTTCT 5160 

GAGATGCAGA OCAOCCTAQA AICTCMCTA GGTTCACTAG AAGTTAGTTA AATCTTCCTT 5220 

TCTCTGTCTT TCTCTTCATT OCATCOCCCA AACCCACCAA ACACTAAGGG AGAGCTCCCT 5280 

TTGOATGTCT GGGCAGTAAA CCTAGCTCAT TTTTCTAGGA GACCCAGAAG TGACTTCTGA 5340 

GTAGTTATCA CTGTGTCTGC CTCTGTT A CA CTGTGCTGCT TTGCTTAAAC AOAAATGCAG 5400 

GCCTGGACAT CTGACTGTGC CTTTATATTC TGAGTGGGGT GCTGCCCCAT GCAAAAAAAT 5460 

CCAGAGAGGT AJGTGAGGTGT CAGAGCTAAA CA CTTGGTG C TGGGTTTTGT TGATGCTGGT 5520 

ATAATGTGAC ACAGTACAAT TACATGCTAA ATTTTGCATT TTCTCTATAT AACATCTATT 5580

TTTCCTGATA CTGTGCCTTT GCCATTTTGA TAATGCTATT TTGATTGAGT GAATTTTATT 5640 

TCCTTTQTAT TCCCATAGTG AACAATATAT TAAGGTAGAT GCCCTTFATC TGGGTACTCC 5700 

TGGTAGATTA GCTGTTACAC CTCOCTTCCC TTTTTTACAG TGAACCTGTA TTCAGTTATT 5760 

GTCACTCTGA GAACTCTCCA ATAACAATTT CTTTTCCACA GTTAACAACA CAGCTGTTAC 5820 

ACCTCCCOTC CTTTTTTICA CAGTGAACCT QTATTCAGCT ATTCTCACTC TGAGAACTCT 5880 

CCAATAACAA TTTCTTTTCC ACACTTAACA ACAAAGTTCT GTTTTTAAAT GAAGAGATTA 5940 

ACTTCTTITP AAATGCCTAA AGGCAXATTC TGACAACTTT TCTACTTCTT TAACTTTTTT 6000 
GATTTAAGAT ATATGCAAAG CAAATAAATT CAATAAAGCC T 

SEQ ID NO:48 P0G5 Protein sequence
Protein Accession #: BAA85524

1 11 21 31 41 51 

I I I I I I 

EQPTTSQPET TTPOGLLSDK DDHQRRNAGI DFGSRKASAA QPIPENHDHS HVSDPQPYHE 60

DAASGAEKTE ARASLSLMVE SLSTTQERAI LSVAABAQVF HNPSHIQLED QBAFSFDLQK 120

AQSKHESAQD VQTICKEKPS GNVHQTPTAS VLGHTSTTAK GDVYAKTLPP RSLFQSSRKP 180 

DAEEVSSDSE HIPEBGDGSB BLAHGBS5QS LGKFBDEQEV FSESKSFVED tiSSSKRRT.nL 240 

RCLSQALEEP BDAEVFTBSS SYVEKYHTSD DCSSSEBDLP LRHFAQALGK PKNQQEVSSA 300 

SNNTPEEQND PMQQLPSRCP SQPIMNPTVQ QQVPTSSVGT SIKQSBSVEP IPPRHPFQPW 360 

VNPKVEQEVS SSPKSMAVEE SXSKKPIiPPK LLCQPLMNPK VQQNMPSGSE DIAVERVISV 420 

BPLLPRYSPQ SLTDPQIRQI SBSTAVEEQT YVBPLPPRCL SQPSERPKFL DSMSTSABWS 480 

SPVAPTPSKY TSPPWVTFKF EELYQLSAHP BSTTVBEDIS KEQLLPRHLS QMVGKKVQQ 540 

LSSNFERAAI EADISGSPLP PQYATQPLKR SKVQEHTSRL EKHAVEGTSN KSPIPRRPTQ 600 

EFVKPHAQQI PSESSALKRO S0VAPLPFNL PSKSLSKFEV KH0VFSDSOS ANPKGGISSK 660 

MLPHKHPLQS LGRPEOFQKV FSYSERAPGK CSSFKEQLSP KQLSQALKKP BIEQKVSPVS 720 

ASSPKEWRNS KXQLPPKHSS QASBRSKFQP QKSSKGPVNV FVKQSSGEKH LPSSSPFQQQ 780 

VHSSSVNAAA RRSVFESNSD NWFLGRDEAF AXKTKKFSQG SKKFIKSIFA PATKPGKFTI 840 

APVRQTSTSG GIYSKKEDLE SGDGNNNQHA HLSNQDDVEK LPGVRLKHAP PSQKYKSBKO. 900 

UNFTQLASVP SGPISSSVGR GHXIRSTSQQ LLSAAGNLTK ISYVADKQQS RPKSESKAKK 960 

QPACKTPGKP AGQQSDVAVS RPVWTIMAKQ KQKSFKAHIS VKELKTKSNA GADABTKEFK 1020 

YEGAGSANEN QPKKMPTSSV HKQEKTAQMK PFKPTKSVGF EAQKILQVPA HEKETKRSST 1080 
LPAKPQNPVB PIEPVWPSIA RKKAKAWSHH ABZTQ 

SEQ ID NO:49 PAB7 DNA SEQUENCE

Nucleic Acid Accession #: D87742

Coding sequence: 208-3582 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

GCTTTCCTTT CTAAAGTAGA AGAGGAT6AT TATCCCTCTG AAGAACTACT AGAGGATGAA 60 

AACGCTATAA ATGCAAAACG QTCTAAAGAA AAAAAGCCTG OGAATCAGGG CAGGCAGTTT 120 

GATGTTAATC TGCAAGTCCC TGACAGAGCA GTTTTAGGGA CCATTCATCC AGATCCAGAA 180 

ATTGAAGAAA GCAAGCAAOA AACTAGTATO ATTTTGGATA GTGAAAAAAC AAGTGAOACT 240 

GCTGCCAAAQ -GGGTCAACAC AGGAGGCAGG GAACCAAAXA CAATGGTGGA AAAAGAACGC 300 

CCTCTGGCAG ATAAGAAAGC ACAGAGACCA TTTGAACGAA GTGACTTTTC TGACAGCATA 360 

AAAATTCAGA CTCCAGAATT AGGTGAAGTG TTTCAGAATA AAGATTCTGA TTATCTGAAG 420 

AACGACAACC CTGAGGAACA TCTGAAGACC TCAGGGCTTG CAGGGGAGCC TGAGGGAGAA 480 

CTCTCAAAAG AGGACCATGG GAACACAGAG AAGTACATGG GCACAGAAAG CCAGGGGTCT 540 

GCTGCTGCAQ AACCTGAAGA 1GACTCQTTC CACTGGACTC CACATACAAG TGTAGAGCCA 600 

GGGCATAGTG ACAAGAGGGA GGACTTACTT ATCATAAGCA GCTTCTTTAA AGAACAACAG 660 

TCTTTGCAGC GGTTCCAGAA GTACTTTAAT GTCCATGAGC TGGAAGCCTT GCTACAAGAA 720 

ATGTCATCAA AACTGAAGTC AGCGCAGCAG GAGAGOCTGC CCTATAATAT GGAAAAAGTC 780 

CTAGATAAGG TCTTCCGTGC TTCTGAGTCA CAAATTCTGA GCATAGCAGA AAAAATGCTT 840 

GATACTCGTG TGGCTGAAAA TASASATCTG GGAATGAACG AAAATAACAT ATTTGAAGAG 900 

GCTGCAGTGC TTGATGACAT TCAAGACCTC ATCTATTTTG TCAGGTACAA GCACTCCACA 960 

GCAGAGGAGA CAGCCACACT GGTGATGGCA CCAOCTCTAG AGGAAGGCTT GGGTGGAGCA 1020 

ATGGAAGAGA TGCAACCACT GCATGAAGAT AATTTCTCAC GAGAGAAGAC AGCAGAACTT 1080 

AATGTCCAGG TTOCTGAAGA ACCCAOCCAC TTGGACCAAC GTGTGATTGG GGACACTCAT 1140 

GOCTCAGAAG TGTCACAGAA GCCAAATACT GAGAAAGACC TGGAOCCAGG GCCAGTTACA 1200 

ACAGAAGACA CTCCTATGGA TGCTATTGAT GCAAACAAGC AACCAGAGAC AGOCGCCGAA 1260 

GAGCCGGCAA GTGTCACACC TWGGAAAAC GCAATOCTTC TAATATATTC ATTCATGTTT 1320 

TATTTAACTA AGTCGCTAGT TGCTACATTG CCTGATGATG TTCAGCCTGG GCCTGATTTT 1380 

SAIGGACTGC CATGGAAACC TGTATTEATC ACTGCCTTCT TGGGAATTGC TTCGTTTGCC 1440 

ATTTTCTTAT GGAGAACTGT CCTTGTTGTG AAGGATAGAG TATATCAAGT CAOGGAACAG 1500 

CAAATTTCTG 'AGAAGTTGAA GACTATCATQ AAAGAAAATA CAGAACTTGT ACAAAAATTG 1560 

TCAAATTATG AACAGAAGAT CAAGGAATCA AAGAAACATG TTCAGGAAAC CAGGAAACAA 1620 

AATATGATTC TCTCTGATGA AGCAATTAAA TATAAGGATA AAATCAAGAC ACTTGAAAAA 1680 

AATCAGGAAA TTCTGGATGA CACAGCTAAA AAT CT TC GT G TTATGCTAGA ATCTGAGAGA 1740 

GAACAGAATG TCAAGAATCA GGACTTGATA TCAGAAAACA AGAAATCTAT AGAGAAGTTA 1800 

AAGGATGTTA TTTCAATGAA TGCCTCAGAA TTTTCAGAGG TTCAGATTGC ACTTAATGAA 1860 

GCTAAGCTTA GTGAAGAGAA GGTGAAGTCT GAATGCCATC GGGTTCAAGA AGAAAATGCT 1920 

AGGCTTAAGA AGAAAAAAGA GCAGTTGCAG CAGGAAATCG AAGACTGGAG TAAATTACAT 1980 

GCTGAGCTCA GTGAGCAAAT CAAATCATTT GAGAAGTCTC AGAAAGATTT GGAAGTAGCT 2040 

CTTACTCACA AGGATGATAA TATTAATGCT TTGACTAACT GCATTACACA GTTGAATCTG 2100 

TTAGAGTGTG AATCTGAATC TGAGGGTCAA AATAAAGGTG GAAATGATTC AGATGAATTA 2160 

GCAAATGGAG •AAGTGGGAGG TGACCGGAAT GASAAGATCA AAAATCAAAT TAAGCAGATG 2220 

ATGGATGICT CTCGGACACA GACTGCAATA TCGGTAGTTG AAGAGGATCT AAAGCTTTTA 2280 

CAGCTTAAGC TAAGAGCCTC CGTGTCCACT AAATGTAACC TGGAAGACCA GGTAAAGAAA 2340 

TTGGAAGATG ACCGCAACTC ACTACAAGCT GCCAAAGCTG GACTGGAAGA TGAATGCAAA 2400 

ACCTTGAGGC AOAAAGTQGA GATTCTGAAT QAGCTCTATC AGCAGAAGGA GATGGCTTTC 2460 

CAAAAGAAAC TGAGTCAAGA AGAGTATGAA GGGCAAGAAA GAGAGCACAG GCTCTCAGCT 2520 

GCAGATGAAA AGGCAGTTTC GGCTGCAGAG GAAGTAAAAA CTTACAAGCG GAGAATTGAA 2580 

GAAATGGAGG ATGAATTACA CAAGACAGAG CGOTCATTTA AAAACCAGAT CGCTACCCAT 2640 

GAGAAGAAAG CTCATGAAAA CTGGCTCAAA GCTCGTGCTO CAGAAAGAGC TATAGCTGAA 2700 

GAGAAAAGGG AAGCTGCCAA TTTGAGACAC AAATTATTAG AATTAACACA AAAGATGGCA 2760 

ATGCTGCAAG AAGAACCTGT GATTGTAAAA CCAATGCCAG GAAAACCAAA TACACAAAAC 2820 

CCTCCACGGA GAGGTCCTCT GAGCCAGAAT GGCTCTTTTG GCCCATCCCC TGTGAGTGGT 2880 

GGAGAATGCT CCCCTCCATT GACAGTGGAG CCACCCGTGA GACCTCTCTC TGCTACTCTC 2940 

AATCGAAGAG ATATGOCTAG AAGTGAATTT GGATCAGTGG AOGGGCCTCT ACCTCATCCT 3000 

CGATGGTCAG CTGAGGCATC TGGGAAAOCC TCTCCT T CTG ATCCAGGATC TGGTACAGCT 3060 

ACCATGATGA ACAGCAGCTC AAGAGGCTCT TCCCCTACCA GGGTACTCGA TGAAGGCAAG 3120 

GTTAATATGG CTOCAAAAGG GCCCOCTCCT TTCCCAGGAG TCCCTCTCAT GAGCACCOCC 3180 

ATGGGAGGCC CTGTACCACC ACCCATTCGA TATGGACCAC CACCTCAGCT CTGCGGACCT 3240 

TTTGGGCCTC GGCCACTTCC TCCACCCTTT GGCOCTGGTA TGCGTOCACC ACTAGGCTTA 3300

AGACJAATTTO CACCAGGCGT TCCACCAGGA ACACGGGACC TGCCTCTCCA CCCTCG GOGA 3360

TTTTTAOCTG GACACGCACC ATTCftGAOCT TTAGGTTCAC TTGGCCCAAG AGAGTACTTT 3420

ATTCCTGCTA CCCGATTACC AOCCOCAACC CATQGTCCCC AGGAATACCC ACCACCACCT 3480

GCTGTAAGAG ACTTACTGCC OTCAGGCTCT AGAQATGAGC CTCCACCTGC CTCTCAGAGC 3540

ACTAGCCAGG ACTGTTCACA GGCTTTAAAA CAGAGCOCAT. AAAACTATGA CCTCTGAGGT 3600

TTCATTGGAA AGAAAGTGTA CTGTGCATTA TCCATTACAG TAAAGGATTT CATTGGCTTC 3660

AAAAXOCAAA AGTTTATTTT AAAAGGTTTG TTGTTAGAAC TAAGCTGCCT TGGCAGTGTG 3720

CATTTTTGAJ3 CCAAACAATT CAAAAATGTC ATTTCTTCCC TAAATAAAAA TCACCTTTTA 3780

AGCTAGAGCG TOCTTACAAC TTTGAAATGT GCAATAAAGA ATACCTGTGT TTTAGCTAAT 3840

GTAGCATATG TAATTGCAAA ATGATTTAGA ATGTCATCAA AAATATGAAC ATTTCCTGTG 3900

GAAATGCTTT AAGAACATGT ATTTCCATTA TCCTATTTTT AGTGTACACC AGCTGAATAC 3960

GGAGCAATGG TGTTTATAAG CGTTTTTTTA AACTATCTGG TCACARAGAC TGTTACGCTA 4020

AAAATGTTTA CTAAAAGATC ACTAAACTAT CTCCCCTCTT GCTGAAGTTC TTTGTAGTAA 4080

TAGCTCATAA ARATTTGTTT ATTAATATTT CCCAACTGTC TGTTGACTCA TTGGACTGTT 4140

ATGAGGCTTG TGCCATTTGG GGAACATGTA AACTCAGGCT CCCAGAACTG AAGATGGTGG 4200

CTGGTGGCAC ACTTCCCOCT GCTOCTCCGT CACCIGTGAA CTCTACAAGT GATGTCTTTT 4260

TATTTCAAAG AAGTTTATTT OCCACTTGTA TAGCATTCAC ATGCTTTCTT TACGATCCTC 4320

ATTGTCTATT TOAGAATGGT TTTCTGAGAG TGAGTTTACA TTAGTAGCAA GAQTTGTTTG 4380

AGCTGATGTT OCATTCTTTT TACCATTCCT GTAGAAAAAG GGTGCACAAC AGAAAAATGA 4440

AAATGAXGTG TCATGGCCAT AAAAGTATAG AAATCTTTAA AAATTTTAAA ATGTACAGTC 4500

CCTTATCTAT CTTTCCCATT OCTTGCCACT GATTTTTGAG GAATATAATA AAAAQATTGG 4560

AAGAGTATAA TGCCATGAGA AAGAATGATT TAGGACTSTG AGGGTTATAA CATGCCCTAG 4620

GTCAGCAACC AAGGGTTGAA ATCAGTTCTG TTTTAGGGGG AAATGGGGGG GGCGACAGAT 4680

ATTATTCCAA AATTAATATT AATTAATATT TAAACGTTGO TGTTTTTATT TAAAAATCAG 4740

TAACTAACCA TCTGGAATTG CACCATACTT AAAGTCTTAT CCATTACTAC ACTGTCTTTA 4800

AAACAATGTT TCTTTAAATA CTCTACAACG TTTCTAAGAA CGAACTTCAG ACATTTTAAT 4860

TACACTAATA ATAGCACTCC TTTTAAGGAG TTTCWSATOC ACACTAAAAC TAAAATCATA 4920

AAAGGCTGAT ACTTTTGTTT GCTGCTAGGC TATATTCTTC CATTCTTTGA AGTCCTATGA 4980

TGTAATATTT TTGAAACCTA GTGTATGTCT TGTCACTGTT GTGATATTTA ATCGATTAAG 5040

AATACCTTGT AAAAAGGAGC AAAAGCTTCA ATGTGAAACA ATTTTCTCTC TTTATACTAA 5100

ACAACTGAAG ATAGATAGTT TAGAAAGATA AGGAOCTTTG AAAGAASACA ACTCTGTCAA 5160

ACTTCATAAG GAATATAAAA ATTCTTCAGG AAAAGAGAAT TCAATCTATA            5220

TTTAATATCA AGAATAGAAG AAATTAftGAG GAAAACTCCA CAGAAGAGCA TAGGCCACTT 5280

TEAGCCATGT AAAAATAAGA TTAAGTCACA AATACAACTT TTGAATTTAC CTGTCAATAT 5340

CTCTTTAGGA CACAAAACAA TGCTGAAGTT AATATAATTT CTAATTTTAA ATGTCATTTA 5400

AGTGTAGATT ATGCCATCTA GGAAGGTAAG TAGGAAAGGT AAATTAAATC TATTTTTAAA 5460

ATTCAAAATA TTAGAGTATT TTTCCCCTCT AAAGCCTTTT TTGGTGATTA TTCTGTATCT 5520

GACATAATTG AGAAACTGGT AAGCTGTAAA GATTCCAGTG TAGCTTCTCT GAGAAGTTGT 5580

GAGCCAGTCC ATAACTGCTT CCTCACATCC ATCTGATTGC ACCATTTCTG CAGCAAACCC 5640

CAAAGCAGGG TGCCAATATG CAGATGGCAT AGGGAGTATC ATCCCTCAGC CAAATCACTT 5700

TTOCATCTCT AAAGTTTCAT CTATTTTGGA AGTCATCTCC AACTAASTGT GTCTGGATTT 5760

AGTTGCTAAA ATTGTCTTAT TTATTTATGA AGCAGCAATA TTCAGCCTGA AAGCATTTCT 5820

GOCATAGTTG TTGTAGTTAT ATCGCCAATG GCTGATTTTT TTCATTGGAA AGTAAATTTA 5880

AGTAATTCGT GGGATGTGGT ATATTCTGTG TCAACTTCAA GATAATCACT CATTTTCTCG 5940

TTATATTCAG GTCTGAATTA AAGTTAAGTT AATCAC






SEQ ID NO:50 PAB7 Protein sequence
Protein Accession #: BAA13448



1 11 21 31 41 51

I I I I I I

AFLSKVEEDD            NAINAKRSKE KNPGNQGRQP DVNLQVPDRA VLGTIHPDPE 60

XBESKQETSH ILDSEKTSET AAKGVNTGGR EPNTKVEKER PLADKKAQRP PEKSDFSDSI 120

KIQTPELGEV FCNKDSDYLK NDNPEEHLKT SGIAGEPEGE LSKEDHGNTE KYKGTBSQGS 180

AAAEPEDDSF HWTPHTSVEP GKSDKBEDZiL IISSFFKEQQ SMJRPQKYFN VHELEALLQE 240

HSSKLKSAQQ ESLPYKMSKV LDKVFRASES QILSIAEKML DTRVAEHRDL GHNEKNIFBE 300

AAVLDDIQOL IYFVRYKHST AEETATLVHA PPLEEGLGGA MBEMQPLHED NFSREKTABL 360

NVOVPBEPTH LDQRVIGDTH ASEVSQKPOT EKDLDPGPVT TEDTPMDAID ANKQPETAAE 420

EPASVTPLEN AZLLIYSFHF YLTKSLVATL PDDVQPGPDF YGLPWKPVFI TAPLGIASFA 480

IFLWRTVLW KDRVYQVTEQ QISEKLKTIH KENTELVQKL SNYEQ KIKES KKHVQETBKQ 540

NMILSDEA1K YKDKIKTLEK NQEILDDTAK NLRVKLBSER EQNVKNQDLI SENKKSIEKL 600

KDVISMNASE FSEVQIALNE AKLSEEKVKS ECHRVQEENA RLKKKKEQLQ QEIBDWSKLH 660

AELSEQIKSP EKSQKDLBVA LTHKDDN1NA LTNCITQUJL LECESBSEGQ NKGGNDSDBL 720

ANGEVGGDRH EKMKNQIKQM HDVSRTQTAI SWEEDLKLL QLKLRASVST RCNLEOOVKK 780

LEDDRHSLQA AKAGLKDECK TLRQKVEILN ELYQQKEMAL QKKLSQEEYE RQEREHRLSA 840

ADEKAVSAAE EVKTYKRRIE EHEDELQKTB RSPKKQIATH EKKAHEHWLK ARAAEHAIAE 900

EKREAANLRH KLLELTQKMA KLQEBWD/K PMPGKPNTQN PPBRGPLSQN GSFGPSPVSG 960

GECSPPLTVE PPVRPLSATL KRRDMPRSBP GSVDGPLPHP RWSAEASGKP SPSDPGSGTA 1020

THMNSSSRGS SPTRVLDEGK VNMAPKGPPP FPGVPLMSTP MGGPVPPPIR YGPPPQLCGP 1080

FGPRPLPPPP GPGHRPPLGL REFAPGVPPG RRDLPLHPRG FLPGHAPFRP U3SLGPHEVF 1140

IPGTO1PPPT HGPQEYPPPP AVRDLLPSGS RDEPPPASQS TSQDCSQALK QSP



SEQ ID NO:51 PAB9 DNA SEQUENCE

Nucleic Acid Accession #: NM_006457

Coding sequence: 84-1874 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

I I I I I I

AGACTGAGGC GGAGGCAGCC CCGCGCXWCG CCGGACCCGA GCATATTTCA TTTTCTGTCA 60

TTGGACTTTG AGCCATTAGA AOCATGAGCA ACTACAGTGT GTCACTGGTT GGCCCAGCTC 120

CTTGGGGTTT CCGGCTGCAG GGCGGTAAGG ATTTCAACAT GCCTCTGACA ATCTCTAGTC 180

TAAAAGATGG CGGCAAGGCA GCCCAGGCAA ATGTAAGAAT AGGOGATGTG GTTCTCAGCA 240

TTGATGGAAT AAATGCACAA. GGAATGACTC ATCTTGAAGC CCAGAAXAAG ATTAAGGGTT 300

GTACAGGCTC TTTGAATATG ACTCTGCAAA GAGCATCTGC TGCACCCAAG CCTGAGOCGG 360

TTCCTGTICA AAAGGGAGAA CCTAAAGAAG TAGTTAAACC TGTGCCCATT ACATCTOCTG 420

CTGTGTCCAA ACTCACTTCC ACAAACAACA TGGCCTACAA TAAGGCACCA CGGCCTTTTG 480

CTlC TO iWU TTCACCAAAA GTCACATCCA TCCCATCACC ATCGTCTGCC TTCACCCCAG 540

CCCATGCGAC CAOCTCATCA CATGCTTCCC CTTCACCCGT GGCTGCCGTC ACTCCTOCCC 600

TGTTCGCTGC ATCTGGACTG CATGCTAATG CCAATCTTAG TGCTGACCAG TCTCCATCTG 660

CACTGAGC6C TGGTAAAACT GCA6TTAATO TCCCACGGCA GCCCACAGTC ACCAGCGTQT 720

OTTCCGAGAC TTCTCAGGAG CTAGCAGAGG GACAGAGAAG AGGATCCCAG GGTGACAGTA 780

AACAGCAAAA TGGCCCACCA AGAAAACACA TTGTGGAGCG CTATACAGAG TTTTATCATG 840

TACCCACTCA CAQTQATGCC AGCAAGAAGA GACTGATTGA GGATACTGAA GACTGGCGTC 900

CAAGAACTGG AACAACTCAG TCTCGCTCTT TCCGAATCCT TGOCCAGATC ACTGGGACTG 960

AACATTTGAA AGAATCFGAA GCCGATAATA CAAAGAAGGC AAATAACTCT CAGGAGCCTT 1020

CTCCGCAGTT GGCTTCCTTQ OTAGCTTCCA CACGGAGCAT GCCCGAGAGC CTGGACAGCC 1080

CAACCfCTGG CAGACCAGQG GTTAOCAGCC TCACAACTGC AGCTGCCTTC AAGCCTGTAG 1140

GATCCACTGG CGTCATCAAG TCACCAAGCT GGCAACGGCC AAACCAAGGA GTACCIYCCA 1200

CTGGAAGAAT CTCAAACAGC GCTACTTACT CAGGATCAGT GGCACCAGCC AACTCAGCTT 1260

TGGGACAAAC CCAGCCAAST GACCAGGACA CTTTAGTGCA AAGAGCTGAG CACAVTCCAG 1320

CAGGGAAACG AACTO CQATQ TGCGCCCATT GTAACCAGGT CATCAGAGGA CCATTCTTAG 1380

7GGCACTGGG GAAATCTTGG CACCCAGAAG AATTCAACTG CGCTCACTGC AAAAATACAA 1440

TGGCCTACAT TGQATTTGTA GAGGAGAAAG GAGCCCTGTA TTGTGAGCTG TGCTATGAGA 1500

AATTCTTTGC CCCTGAATGT GGTCGATGCC AAAGGAAGAT CCTTGGAGAA GTCATCAATG 1560

CGTTGAAACA AACTTGGCAT GTTTCCTGTT TTGTGTGTGT AGCCTGTGOA AAGOCCATTC 1620

GGAACAATGT TTTTCACTTG GAGGATGGTG AACCCTACTG TGAGACTGAT TATTATGCCC 1680

TCTTTGGTAC TATATGOCAT GGATGTGAAT TTCCCATAGA AGCTGGTGAC ATGTTCCTGG 1740

AAJSCTCTGGG CTACAOCTGG CATGACACTT GCTTTGTATG CTCAGTGTGT TGTGAAAGTT 1800

TGGAAGGTCA GACCTTTTTC TCCAAGAAGG ACAAGCOCCT GTGTAAGAAA CATGCTCATT 1860

CTGTGAATTT TTGAAAGTCA ACAGTTCAGG AGAAGAGAAG GAATTTGAAG AGAAAAAGGA 1920

AAATTAAAAT TACTAATTAA TTTTTAGATT CAATATTTAT ATGGAGmT GAAAAATAAT 1980

AGTGGCCCTG AAGGAAXAAA TTCCAGCTTT AAAAACCAAG TCTGAGGAAA TATTTGGCTT 2040

CATAAAGTAA AGAGACGSTT TGGCATTTAT TATTACTTTT TCCTCTATPP TATGCCCATA 2100

AAATAAGCTT TATAAAAACC AATTTCCTGA TGGACTATTA AATTCATC7T AGAATAAATT 2160

AGTGAAGAAT TTAATTTTAG AATAAATAAT CCAATCTGAA ATAATTATAC CTTCTTTCCT 2220

TGTTAGGTAG TTATGAGTAA ATCTGCAAAA GGCAATGAAA ATGCCTTAAA TTTTATCAAT 2280

AACAGAATTA TTGTATTTAA AAAAAAACTA. ATACTTATCT TTAAAATAGT AAATAGGATT 2340

TTAAACAGAG AATTTTATCA GTAATAGGTG TCAGTTTTTA AAAAATTGCT TGTAGGCTGA 2400

GCGCGGTGGC TCACGCCTGT AATCCCAGCA CTTTGGGAGG CCAAGGTGGG TGGACCACAT 2460

GAGGTCAGGA GTTTGAGATC AGCCTGGCCA ACATGGTGAA ACCCCATCTC TACTAAAAAT 2520

ACAAAAATTA GCCGOACGCA GTGGCACGCG CCTGTAATCC CAGCTACTCA AGAGGCTGAQ 2580

GCACGAGAAT CACTTGAACC CGGGAGGGAG AGGTTGCAGT GAGCCAAGAT CGTACCACTG 2640

CACTCCAGCC TGGGTGACAG AGTGAGACTC CGTCTCCAAA AAAAAACTTT GCTTGTATAT 2700

TATTTTTGCC TTACAGTGQA TCATTCTAGT AGGAAAGGAC AATAAGATTT TTTATCAAAA 2760

T6TGTCATGC CACTAAGAGA TGTTATATTC TTTTCTTATT TCTTCCCCAC CCAAAAATAA 2820

6CTACCATAT AGCTTATAAG TCTCAAATTT TTGCCTTTTA CTAAAATGTG ATTGTTTCTG 2880

TTCATTGTGT ATGCTTCATC ACCTATATTA GGCAAATTCC ATTTTTTCCC TTGCGCTAAG 2940

GTAAAGATTT AATTAAATAA TTTTGGCCTC TCATAGTTTT CTCTCTCTTT AAAGAGAATA 3000

AATAGAGGGC CAGGTGTGGT GGCTCACGCC TGTGATCCCA GCACTTTGGG AGGCCAAGAC 3060

GGGCGGATCA TGAGGTCAAG AGATCAAGAT CATCCTGGCC AACATGGTGA AACCCTGTCT 3120

CTACTAAAAA TACAAAAATG AGCTGGGCAT            GCCTGTAGTC CCATGTACTT 3180

GGGAGOCTGA GGCAGGAAAA TTCTTGAACC CAGGAGACGG AAGTTGCAGT GAGCTGAGAT 3240

CACACCACT6 CACTCCAGCC TGGTGACAGA GCAAGACTCC GGCTCTT



SEQ ID NO:52 PAB9 Protein sequence
Protein Accession #: NP_006448



1 11 21 31 41 51

I I I I I I

1 KSNYSVSLVG PAPWGFRLQG GKDFNHPLTI SSLKDGGKAA QANVRIGDW LSIDGINAQG

61 KTHLEACHKI KGCTGSLNMT LQRASAAPRP EPVPVQKGEP KEWKPVPIT SPAVSKVTST

121 NNHAYHKAPH PPGSVSSPKV TSIPSPSSAF TPAHATTSSB ASPSFVAAVT PPLFAASGLH

181 ANAHLSADQS PSALSAGKTA VNVPRQPTVT SVCSETSQEL ABGQRRGSQG DSKQQNGPPR

241 KHIVERYTEP YHVPTHSOAS KKRLIEDTED WRPRTGTTQS RSFRILAQIT GT6HLKBSEA

301 DHTKKANNSQ EPSPQLASLV ASTRSHPESL DSPTSGRPGV TSIiTTAAAFK PVGSTGVIKS

361 PSWQRPNQGV PSTGRISNSA TYSGSVAPAN SALGQTQPSD QDTLVQRAEH IPAGKRTPKC

421 ABCNQVIRGP PLVALGKSWH PEEFNCAHCK OTMAYIGFVE EKGALYCELC YBKFFAPECG

481 RCQHKILGEV INALKQTWKV SCFVCVACGK PIRNNVFHLE DGEPYCETDY YALFGTICBG

541 CEPPIEAGDM FLBALGYTWH DTCFVCSVCC ESLEGQTFFS KKDKPLCKKH ABSVNP



SEQ ID NO:53 PBH7 DNA SEQUENCE

Nucleic Acid Accession #: AA431407

Coding sequence: 1-864 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

I I I I I I

ATGGCCAACT GTAAAATGAC CAAAAGCATC AGGTTCCCTG CCCTGGACCA CTGCTATACT 60

GGCGGGGAGG TCGTGTTGCC CAAGGATCAG GAGGAGTGGA AAAGACGGAC GGGCCTTCTG 120

CTCTACGAGA ACTATGGGCA GTCGGAAACG GGACTAATTT GTGGCACCTA CTGGGGAATG 180

AAGATCAAGC CGGGTTTCAT GGGGAAGGCC ACTCCACCCT ATGACGTCCA C3TTTCATATG 240

GAGGCCTCAG TTGAAAACTO CATTATTOTG AGCATGAACA CCGCTGACCC TGGCAQCCAG 300 

GGCATCACAC ACAGCCTCTT GCTACAGGTC ATTGATGACA AOGGCAGCAT CCTGCCACCT 360 

AACACAGAAG GAAACATTGG CATCAGAATC AAACCTGTCA GGCCTGTGAG CCTCTTCATG 420 

TCCTATGAGG GTGACCCAGA GAAGACASCT AAAQTGGAAT GTGGGGACTT CTACAACACT 480 

CGGGACAGAG GAAAGATGGA TGAAGAGGGC TACATITGTT TCCTGGGGAG GAGTGATGAC 540

ATCATTAATG CCTCTGGGTA TCGCATC6GO CCTGCAGAGG TTGAAAGCGC TTTGGTGGAG 600 

CACCCAGCGG TGGCGGAGTC AGCCGTGGTG GGCAGCCCAG ACCCGATTCG AGGGGAGGTG 660 

GTGAA6GCCT TTATTGTCCT GACCCCACAO TTCCTGTCCC ATQACAAGGA TCAGCTGACC 720

AAGGAACTGC AGCAGCATGT CAAGTCAOTQ ACAGCCCCAT ACAAQTACCC AAGGAAGGTG 780 

GAGTTTGTCT CASAGCT6CC AAAAACCATC ACTGGCAAGA 1TGAACGGAA GGAACTTCGG 840 

AAAAAQSAGA CTGGTCAGAT GTAATCGGCA GTGAACTCAG AACGCACTGC ACACCTGAGG 900 

CAAATCCCTG GCCACTTTAG TCTCCCCACT ATGGTGAGGA CGAGGGTGGG GCATTGAGAG 960 

TGTTGATTTG GGAAAGTATC AGGAGTGCCA TGATTCCAAT GTTTTCCTTC TTTTAAATTA 1020 

AATTCAGTTG CTCTGCTTCC TCCAAGTCCT CTGTATCTTT AGAATTTCCC AGGTGAGCAC 1080 
TCATAACGCA AGTAATAAAA TACTGATOIC AACAA 

SEQ ID NO:54 PBH7 Protein sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51

I I I I I I 

MRNCKKTKSI RFPALEHCYT GGEWLPKDQ EEWKRRTGIX LYENYGQSET GLICATYWGH 60 

KtKPGFKGKA TPFYDVQFHM EASVEHCIIV SHNTADPGSQ GITHSLIXQV IDDKGSILPP 120 

NTEGNIGIHI KPVRPVSLPM CYEGDPEKTA KVECGDFWIT GDRGKHDEEG YICFLGKSDO 180 

IZKASGYRZG PAEVESALVE HPAVAESAW GSPDPIRGEV VKAFIVLTPQ FLSHDKDQLT 240 
KELQQHVKSV TAPYKYPRKV EFVSELPOTI TGKIERKELR KKETQQM 

SEQ ID NO:55 PBJ5 DNA SEQUENCE

Nucleic Acid Accession #: AF388200

Coding sequence: 35-137 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

GAGAGAGGGA GGCAGAAGAG GAAGTCAGAG C GATCT GCTG TGAAATCTAC TACCGTTTGC 60 

TGGTTTTGAA AATGGAGAAA AAGAGTGAGG AACTGAGAAA CATGGATGGC CTTGGGAACG 120 

TGGAAAAGGG TCACTGAAAT GGGACGAC AT GA ACTCAAGG AGGCTATTTA TGACCATGTC 180 

ATTTGCAACA TGAAGAAAGC TTATCTGOAG TGAAAGTAAA TGAGACCAAC AGAGATAAGA 240 

GACCCGGAGA AATCCTGGTT ACACTGCTTG AATCCTGTCA GTCCTATACT GGAGTCCTGT 300 

TAATACAAAA TAATAGTAAT AATCCCTCTG TTTCTTATGT TTATGCCAAC TTCAACAAAA 360 

AGAAACTTGA CTAAGAGACA ATATAAGAAC TTAATGTGTA ATTAAGAAAG AACTCTCCAC 420 

CACGGGGAAT GTGAAAGGTA TATGAGTCCC TTTTCACGAT GCGATGTCAT GTCTTTTAAA 480 
TAAGCCATAC TTTATGTTCA ATAAAAAGAG AATAAGCAGG A 

SEQ ID NO:56 PBJ5 Protein sequence
Protein Accession #: AAK53352

1 11 21 31 41 51

I I I I I I 

KCCEIYYRLL VLKHEKKSEE LRKMDGLGNV BKGH 

SEQ ID NO:57 PBJ7 DNA SEQUENCE

Nucleic Acid Accession #: AA876910

Coding sequence: 1-2064 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

ATGG ACAGTT GCCTGCAACA TATGAGAGAC CTACTTTACC TCCTTCAGGA GCTCAGGTGT 60 

TTAAATCCAG CTACACTACT CCCTGATCCA GACTCCACTA CTCCTGTTCA TGACTGTCAG 120 

GATCTGTTGG AARCTACCAA AACTGGCCAA CCTGATCTTC AAGATGTGOC CCTAGAAAAG 180 

GCAGATGCCA CTGTGTTCAC AGATGGTAGC AGCTTCCTCG AGCAGGGAGA ACGAAAAGCT 240 

GTTTCTTTTC CACAGCCAGA TCTGCCTGAC AATCCCACAT ACTCAACAGA AGAAGAAAAA 300 

CTGGCTTCAG ATGTTGGAGC AAATAAAAAT CAGGAAGGAC GTGTATTCGC AAACACTACT 360 

TGGAGGGCCG GTACCTCCAA GGAAGTCTCC TTTGCAGTTG ATTTATGTGT ACTGTTCCCA 420 

GAGCCAGCTC GTACCCATGA AGAGCAACAT AAT1TGCCGG TCATAGGAGC AGGAAGTGTC 480 

GACCTTGCAG CAGGATTTGG ACACTCTGGG AGCCAAACTG GATGTGGAAG CTCCAAAGGT 540 

GCAGAAAAAG GGCTCCAAAA TGTTGACTTT TACCTCTGTC CTGGAAATCA CCCTGACGCT 600 

AGCTGTAGAG ATACTTACCA GTTTTTCTGC CCTGATTGGA CATGTGTAAC TTTAGCCACC 660 

TACTCTGGGG GATCAACTAG ATCTTCAACT CTTTCCATAA GTCGTGTTOC TCATCCTAAA 720 

TTATGTACTA GAAAAAATTG TAATCCTCTT ACTATAACTG TCCATGACCC TAATGCAGCT 780 

CAATGGTATT ATGGCATGTC ATGGGGATTA AGACTTTATA TCCCAGGATT TGATGTTGGG 840 

ACTATGTTCA CCATCCAAAA GAAAATCTTG GTCTCATGGA GCTCCCCCAA GCCAATCGGG 900 

CCTTTAACTG ATCTAGGTGA CCCTATATTC CAGAAACACC CTGACAAAGT TGATTTAACT 960 

OTTCCTCTGC CATTCTTAGT TCCTAGACCC CAGCTACAAC AACAACATCT TCAACCCAGC 1020 

CTAATGTCTA TACTAGGTGG AGTACACCAT CTCCTTAACC TCACCCAGCC TAAACTAGCC 1080 

CAAGATTGTT GGCTATGTTT AAAAGCAAAA CCCCCTEATT ATGTAGGATT AGGAGTAGAA 1140 

GCCACACTTA AACGTGGCCC TCTATCTTGT CATACACGAC CCCGTGCTCT CACAATAGGA 1200 

GATGTGTCTG GAAATGCTTC CTGTCTGATT AGTACCGGGT ATAACTTATC TGCTTCTCCT 1260 

TTTCAGGCTA CTTGTAATCA GTCCCTGCTT ACTTCCATAA GCACCTCAGT CTCTTACCAA 1320 

GCACCCAACA ATACCTGOTT GGOCTGCACC TCAGGTCTCA CTCGCTCCAT TAATGGAACT 1380 

GAACCAGGAC CTCTCCTGTG CGTGTTAGTT CATGTACTTC CCCAGGTATA TGTGTACAGT 1440 

GGACCAGAAG GACGACAACT CATCGCTCCC CCTGAGTTAC ATCCCAGGTT GCACCAAGCT 1500 

OTOOCACTTC TGGTTCCOCT ATTGGCTGST CTTAGCAIAG CTGGATCAGC AGOCATTSGT 1560

ACGGCTGCCC TGGTTCAAGG AGAAACTGGA CTAATATCCC TGTCTCAACA GGTGGATGCT 1620 

GATTTTAGTA ACCTCCAGTC TGCCATAGAT ATACTACATT CCCAGGTAGA OTCTCTGGCT 1680 

QAAGXAGTTC TTCAAAACTG CCGATGCTTA GATCTGCTAT TCCTCTCTCA AGGAGGTTTA 1740 

TGTOCAGCTC TAGGAGAAAG TTGTTGCT TC TATGCCAATC AATCTGGAGT CATAAAAGGT 1800 

ACAGTAAAAA aagttcgaga aaatctagat AGGCACCAAC AAGAACGAGA AAATAACATC 1860 

CCCTGGTATC AAAGCATGTT TAACTGGAAC CCATGGCTAA CTACTTTAAT CACTGGOTTA 1920 

QCTGGACCTC TCCTCATCCT ACTATTAAGT TTAATTTTTG GGCCTTGTAT ATCAAATTCG 1980 

TTTCTTAATT TTATAAAACA ACGCATAGCT TCTGTCAAAC TTACGTATCT TAAGACTCAA 2040 
TATGACACCC TTGTTAATAA CTGA 

SEQ ID NO:58 PBJ7 Protein sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51 

I I I I I I 

MJSCLQHMRD LUfLLQELSC UWATLLPDP DSTTPVHDCQ DLLBTTKTGQ PDLQDVPLEK 60 

ADATVPTDGS SPLEQGERKA VSFPQPDLPD KPTYSTEEEK LASDVGANKN QEGKVFANTT 120 

KRAGTSKEVS PAVDLCVLFP EPAKTHEEGH KLPVIGAGSV DLAAGFGHSG SQTGCGSSKG 180 

AEKOLQNVDP YLCPGNHPDA SCRDTYQFFC PDWTCVTLAT YSGGSTRSST LSISKVFHPK 240 

LCTRKHCHPL TITVHDPNAA QWYYGMSWGL RLYZFGFDVG TKFTIQKKIL VSWSSPKPIG 300 

PLTDLGOPXP QKHPDKVDLT VPLPPLVPRP QLQQQHLQPS LHSILGGVBH LLNLTQPRLA 360 

QDCWLCLKAK PPYYVGLGVE ATLKRGPLSC HTRPHALTIG DVSSMASCU STGYHLSASP 420 

FQATCNQSLL TSISTSVSYQ APHHTWLACT SGLTRCINGT EPGPLLCVLV HVLPQVW7S 480 

GPEGRQLIAP PELHPRLHQA VPLLVPLLAG LSIAGSAAIG TAALVQGETG LISLSQQVDA 540 

DFSNLQSA1D ILHSQVESIA EWLQNCRCL DLLFLSQGGL CAALGESCCF YANQSGVIKG 600 

TVKEVRENLD RHQQERENNI PWYQSHFNWN PWLTTLITGL AGPLLIIiLLS LIPGPC1LNS 660 
PLHFIKQRIA SVKLTYLKTQ YDTLVNN 

SEQ ID NO:59 PCQ1 DNA SEQUENCE

Nucleic Acid Accession #: NM_019005

Coding sequence: 182-1885 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51 

I I I I I I 

TGATGGTGGA AATTTCTTGA AACCGCTCTC GTAATTTGCC ACGTGCTGTT GCAAATATTC 60 

TGGTGAATGA ACACAGAATC AGCATGGCTT TCCTTTGCTG AGAAATCACT GATGGGAAGT 120 

GAGACTTGTT AAACTTGAAA GTGAATGGAC CTGAGTGGAC CCTTTGATCA CATCAGTAAA 180 

CATGAGCGGT ACCAAACCTG ATATTTTATG GGCACCACAC CATGTTGATA GATTTGTTGT 240 

GTGTGACTCA GAACTAAGTC TTTATCATGT GGAATCTACT GTGAATTCAG AACTCAAAGC 300 

TGGATCTTTA CGTTTATCTG AAGACTCTGC AGCTACAITA CTGTCAATAA ATTCAGATAC 360 

ACCCTATATG AAATGTGTTG CCTGGTATCT TAATTATGAT CCTGAATGTC TGCTGGCAGT 420 

TGGACAAGCA AATGGTCGAG TTGTACTTAC AAGCCTTGGT CAAGATCATA ACTCAAAGTT 480 

CAAAGATTTG AXAGGAAAAG AGTTTGTTCC AAAACATGCA CGACAATGTA ATACCCTTGC 540 

CTGGAATCCA CTGGATAGTA ACTGGCTAGC TGCTGGTTTA GAXAAGCACA GAGCTGACTT 600 

TTCAGTGCTA ATATGGGATA TCTGCAGCAA ATATACTCCT GATAXAGTTC CCATGGAAAA 660 

ASTSAAACTT TCAGCAGGTG AAACTGAAAC AACATTATTA GTAACAAAAC CACTTTATGA 720 

GTTAGGACAG AATGATGCTT GTCTGTCTCT TTGTTGGCTT CCACGAGAOC AGAAACTTCT 780 

cenxicnsar atgcatcgta acctagctat atttgatctt cggaatacaa gccaaaagax 840 

GTTCGTAAAT ACAAAAGCTG TTCAGGGTGT GACGGTAGAC CCATATTTCC ACGATCGTGT 900 

TGCTTCCTTC TATGAAGGTC AGGTTGCAAT ATGGGATCTT AGAAAATTTG AGAAGCCAGT 960 

TTTGACATTG ACTGAGCAAC CAAAACCCTT AACAAAAGTA GCATGGTGTC CCACTAGGAC 1020 

TGGTCTACTT GCCACTTTAA CAAGGGATAG TAATATTATT AGATTGTATG ATATGCAGCA 1080 

TACACCCACT OOCATTGGGG ATGAAACTGA ACCCACAATA ATTGAAAGAA QTGTGCAACC 1140 

TTGTGACAAT TACATTGCTT GCTTTGCGTG GCATCCAACA AGTCAAAATC GAATGATAGT 1200 

TGTAACTCCC AACCGAACAA TGTCAGACTT CACTGTTTTT GAAAGGATAT CTCTTGGCTG 1260 

GAOCCCAATT ACATCTTTAA TCTGGGCTTG TGGTCGTCAT TTATATGAAT GTACGGAAGA 1320 

AGAAAATGAT AATTCTTTAG AAAAAGATAT AGCAACGAAG ATGCGTCTTC GGGCTTTATC 1380 

AAGGTATGGA CTTGATACAG AGCAGGTGTG GAGGAACCAC ATTTTAGCTG GAAATGAAQA 1440 

TCCACAGCTC AAGTCACTCT GGTATACTCT GCACTTTATG AAGCAATACA CAGAAGATAT 1500 

GGATCAGAAA TCTCCAGGCA ACAAAGGAIC ATTGGTTTAT GCAGGAATTA AATCAATTGT 1560 

AAAGTCATCG TTGGGAATGG TGGAAAGCAG CAGACATAAT TGGAGTGGGT TGGATAAGCA 1620 

AAGTGATATT CAAAACTTAA ATGAAGAGAG AATCTTAGCT TTACAGCTTT GTGGGTGGAT 1680 

AAAGAAAGGA ACGGATGTAG ACGTGGGGCC ATTTTTGAAC TCOCTTGTAC AAGAAGGGGA 1740 

ATGGGAAAGA GCTGCTGCTG TGGCATTGTT CAACTTGGAT ATTCGCCGAG CAATCCAAAT 1800 

CCTGAATGAA GGGGCATCTT GTGAAAAAGG CAGGAGATCT GAATCTCAAT GTGGTAGCAA 1860 

TGGCTTTATC GGGTTATACG G ATGAG AAGA ACTCCCTTTG GAGAGAAATG TGTAGCACAC 1920 

TGOGATTACA GCTAAATAAC CCGTATTTGT GTGTCATGTT TGCATTTCTG ACAAGTGAAA 1980 

CAGGATCTTA CGATGGAGTT TTGTATGAAA ACAAAGTTSC AGTACGTGAC AGAGTGGCAT 2040 

TTGCTTGTAA ATTCCTTAGT GATACTCAGA TACATCGAAA AGTTGACCAA TGAAATGAAA 2100 

GAGGCTGGAA ATTTGGAAGG AATTTTGCTT ACAGGCCTTA CTAAAGATGG AGTGGACTTA 2160 

ATGGAGAGTT ATGTTGATAG AACTGGAGAT GTTCAAACAG CAAGTTACTG TATGTTACAG 2220 

GGTTCACCTT TAGATGTTCT TAAAGATGAA AGGGTTCAGT ACTGGATTGA QAATTATAGA 2280 

AATTTATTAG ATGCCTGGAG CTTTTGGCAT AAACGAGCTG AATTTGATAT TCACAGGAGT 2340 

AAGTTGGATC OCAGTTCCAA GCCTTTAGCA CAAGTTTTTG TGAGTTGCAA TTTCTGTGGC 2400 

AAGTCAATCT CCTACAGCTG TTCAGCTGTG CCTCATCAGG GCAGAGGTTT TAGTCAGTAT 2460 

GGTGTGAGTG GCTCACCAAC GAAATCTAAA GTCACAAGTT GTOCTGGCTG TCGAAAACCA 2520 

CTTCCTCGAT GTGCGCTTTG TCTCATTAAT ATGGGAACAC CAGTTTCTAG CTGTCCTGGA 2580 
GGAACCAAAT CAGATGAAAA AGTGGACTTG AGCAAGGACA AAAAATTAGC CCAATTTAAC 2640

AACTGGTTTA CATGGTCTCA TAATTGCACG CACGGTCGAC ATGCTGGACA TATGCTTACT 2700 

TGGTTCASGG ACCATGCAGA GTGCCCTGTG TCTGCATGCA CGTGTAAATG TATGCAGTTG 2760 

GATACAACGG GGAATCTGGT ACCTGCAGAG ACTGTCCAGC CATAAAATGT TACCACCTTA 2820 

AGAGAACCCT TCAAGTOTGG AGCTTTCTAO TAGQTGTCCT TCATAGCTCA GAAACATACC 2880 

TCAGAACAAG CCATTCATGA CTTACCTGTA ATGGGAAAAT AAATCATTCT ATCAGAAAAA 2940 
AAAAAAAAAA AAAAAAAAAA 

SEQ ID NO:60 PCQ1 Protein sequence
Protein Accession #: NP_061878

1 11 21 31 41 51 

I I I I I I 

KSGTKPDILW AFHEVDRFW CDSELSLYHV ESTVNSELKA GSLRLSEDSA ATLLSINSDT 60 

PYHKCVAWYL NYDPECLLAV GQAKGRWLT ELGQDHNSKP KDLIGKBFVP KHASQCMTLA 120 

KOTLDSNWLA AGLDKHRADP SVLIMDICSK YTFDIVPMEK VKLSAGETET TLLVTKPLYE 180 

lyGQNDACLSL CWLPHDQKLL LAGMHKNLAI FDLRHTSQKH FVNTKAVQGV TVDPYFHDHV 240

ASPYEGQVAI WDLRKFEKPV LTLTEQPKPL TKVAWCPTRT GLLA3T/TRDS NIIELYDHQH 300 

TPTPIGDBTE PTIIERSVQP CDNYIASFAW HPTSQHRMIV VTPNRTMSDF TVFERISLAW 360 

SPITSLHWAC GRHLYECTEE ENDNSLEKD1 ATKMRLRALS RYGLDTEQVW BNHILAGNED 420 

PQLKSLVKTL HFMKQYTEDM DQKSPGKKGS LVYAGUCSIV KSSLGHVESS EHNWSGLDKQ 480

SDIQNLNEER XLAIiQLCGWI KKGTDV0VGP PLNSLVQEGE WERAAAVALF HLDSUtAIQI 540 
IHEGASSEKG RHSESQCGSN GFIGLYG 

SEQ ID NO:61 PDG3 DNA SEQUENCE

Nucleic Acid Accession #: U42359

Coding sequence: 563-775 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

1TGTACATCT TAACAACCTT AAGCTGTACA AATAGANCAA TAATATCTAA ATGGTGTGAT 60 

GATCAGCCCA CAGTACACAT CATTGATGAG AATTTCACTG GTCTCAACCT TTCTCATGCT 120 

GAGTCCT6GC TTTGTAAAAT GACTTATAAA GGTCCAAGGA TTTAGAGATG ATTAAGAGAT 180 

AAGCTGGCAT TCTGTAAAGG CACCATCGTC TATCCCCTGT CTTATCTAGA TAAAGAATGT 240 

AGTGCTAAAT CTTCTAATAA TATTGTACAA ATGGAAATTC AATCTTAAGO ATTATTTTTT 300 

CCATATTGTT GTATTTCATT GTGGTGTATT GGAAAGTGAT CTGGACTTTG AGTGAGAAGA 360 

TGTGATTTGG ACCATGCCAC TTAAAAACTC TATAACCTCA GGCAAGTCTT TTAATCTTCT 420 

CTGAGCCTCA GTTTTCCTCA TTTTTCAAAT ATAGAGAGTA TAACATTTAT CTCAXAAGAC 480 

AAGTTGTAGT AAATTACTGT TTTACAAATG TAAGATAACT TTTAACTGTG AGATTCCATA 540 

TTCCAGTCTT ACATTATTAT GTTTATCTGC CACAGGGAGA AGTCCTCAGA TAAAAATGTC 600 

TACCAAAAGA CTGACACGTG GAGTTAATCA TTTGACAGAT GCAAATGCTT CCACCCCCAA 660 

CAAATATACT TTCTTTAACT TCTGTGTGGG TATCACTTAG GGAAAAAAAG GCAGGCAACA 720 

AAATATTTTT TAATTCTATC TTAGGAAAAA TTGTAGNCAA ATCTTTTTNT CCCATTAACA 780 

AATAATGTAA GCCTTAATAT TCAAGGCGTA ATAAAAATAC AAAGTCTTCC AAACAGGTAA 840 
CTTACTTGAA AACTTT 

SEQ ID NO:62 PDG3 Protein sequence
Protein Accession #: AAB18375

1 11 21 31 41 51 

I I I I I I 

HGAHGAPSRR ROAGRRLRYX* PTGSFPFLU. LLLLCIQI^GG GQKKKENLLA EKVEQLHEWS 60 

SRRSIPRMNG DKPRKFIKAP PRNYSMIVHF TALQPQRQCS VCRQANEEYQ ILANSWRYSS 120 

AFCNKLFFSH VEYDEGTDVF QQLNMNSAPT FXHXPPKGRP KRADTFDLQR IGFAABQLAK 180 

WIADHTDVKI KVFRFPNYSG TIALALLVSL VGGLLYXRKN NLEFXYNKtG WAMVSfcCIVF 240 

AlfTSGQMWNH ZRGPPYAHKN PHHGQVSYIH GSSQAQFVAB SHIZLVLNAA ITHGHVIXNE 300 
AATSKGOVGK RRIICI/VGLG LWFPFSPLL SIPRSKYHGY PYSDLDFE 

SEQ ID NO:63 PDG8 DNA SEQUENCE

Nucleic Acid Accession #: AL080235

Coding sequence: 245-453 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

GGTCGCCGCA CCGGCCGCCT CCGGCCCGCC GCCGCCCCCA GCGCCGCCGC CGCCACCGCC 60 

GGGGCGCCCA CCGOGCTOCC AGCCTACCCC GCGGCCGAGC CGCCCGGGCC GCTGTGGCTQ 120 

CAGGGCGAGC CGCTGCATTT CTGCTGCCTA GACTTCAGCC TGGAGGAGCT GCAGGGCGAG 180 

CCGGGCTGGC GGCTGAACCG TAAGCCCATT GAGTCCACGC TGQTGGCCTQ CTTCATGAOC 240 

CTGGTCATCG TGGTGTGGAG CGTGGCCGCC CTCATCTGGC CGGTGCCCAT CATCGCCGGC 300 

TTCCTGOCCA ACGGCATGGA ACAGOGCCGG ACCACCGCCA GCACCACCGC AGCCACCCCC 360 

GCCGCAGTGC CCGCAGGGAC CACCGCACCC GCCGCCGCCG CCGCCGCTGC CGCCGCCGCC 420 

GCGGCCGTCA CTTCGGGGGT GGGGACCAAG TOACCCOCTC CGCTCCTCCC TGTGTCCGTC 480 

CTGTGTCCGC GCGCGCGGGT GCCTTTCCCG CCGGGGACTC GGCCGGTGTG CTTCGTGCTG 540

TAGTTATCGT TAGTTCCTCT TCCCGAGATG GGGCCGCCGA GAGACCCCAG CGCCTTTGAA 600 

AAGCAAGGTT TGTGCTGCGC TTCCAGTTCC GAAAAGCAGA TGTTTAAGCC CTTGGACTGA 660 

GGGTGGGATC GCAGCTCCGA AGACGGAGAG GAGGGAAATG GGGCCCTTTC CCCTCTATTG 720 

CATCCCCCTG CCCGACTCCT TCCCCGCACC CACGTGCCCT AGATTCATGG CAGAAAATGA 780 

CCAAATCCTG TGTATTTGTT TTATATATTT AATAACTGTT TTAAATGAAA GTTTTAGTAA 840 

AAAAAATACA AAACAAAAAG ATTAAATTOC TATTGCTGTA GTAAGAGAAG CTCTTTGTAT 900 

CTGAACATAG TTGTATTTGA AATTTGTGGT TTTTTAATTT ATTTAAAATT GGGGGGAGGG 960 




CATGGGAAGG ATTTAACACC GATATATTOT TACCGCTGAA AATGAACTTT ATGAACCTTT 1020 
TCCAAGTTGA TCTATCCAGT GACGTGGCCT GGTGGGOGTT TCTTCTTGTA CTTATGTGGT 1080 
TTTTTGGCTT TTAATACAGA CATTTTCCTC CAAAAAAAAA AAAAAAAAGG 

SEQ ID NO:64 PDG8 Protein sequence
Protein Accession #: CAB45781






1 11 21 31 41 51 

I I I I I I 

GRRTGRLRPA AAPSAAAATA GAPTALPAYF AAEPPGPLWL QGEPLHFOCL DFSLEELQGE 
PGKRLHRKPI ESTLVACFHT LVIWW5VAA LIWPVPIIAG FLPNGMEQRR TTASTTAATP 
AAVPAGTTAA AAAAAAAAAA AAVTSOVATK 

SEQ ID NO:65 PDM1 DNA SEQUENCE

Nucleic Acid Accession #: NM_006765

Coding sequence: 149-1195 (underlined sequences correspond to start and stop codons)






TCCCGGAGGC 
CGOGTGGAGG 
GCAAGCGGGG 

AAAAGTAGAG 
TMATTCCGA 
TGCTCTTCAG 
ACTGGOGAAC 
GGACTATGAT 
CAYGCATTTW 
TGGATTTGCA 
GGTTTTCAGA 
TGGAGGTTTG 
GGCCATGGTG 
CCGTGGAOCT 
GAGCAGCCAG 
CACCATGOGG 
ACGGATAATT 
AATATTTCGT 
TCTGATTTGG 
CAAGTGGGAT 
CTATTTTGAA 
AACTGTGGGT 
ACAAAGGAAA 
CAATAAATGA 



11 
I 

CCGGGTCCCT 
TGGCCGGGCA 
AGACACTCCC 
CGGCGGCTGC 
TGCATCCAGC 
CAGCTGATGG 
AAATTTATAA 
CCTCAGCGGC 
TCCTGGCGCT 
GAGGGGACAO 
CCTOCAAAAG 
GCTGAGCAAC 
CCACCCAACT 
CTTTATTNGA 
TCTCTGTGTA 
CCATATGCTC 
GCTCAGTTTG 
ATGGTTCTTC 
TGCCTAGTGG 
TCCAAGTACC 
ACCATGGCAC 
TTGCATAAAG 
TTCATTCATT 
TTTCCTAGTA 
TATCAAAGTG 
CAATGXAATT 



21 
I 

0GCAAAGCC6 
GGCGTGGTGC 
CTGCCGCGAT 
GGTACCTGCC 
TOGGGGGAGG 
AATGGAGTTC 
AGGCACCACC 
AGTGTTCTGT 
ATTCATCTGC 
ACGTTTTTCA 
GCAGACCTAA 
TAGCAAAGTG 
ACTCTGGTAC 
GAAGGAACAA 
TAGTCTTTGC 
ATAAGAACCC 
TGGCAGAATC 
TAAATGAAGC 
GATTGGGOCT 
ACGGCTATCC 
TTAAAAACTC 
TGAATGTTTA 
TCATTGTGAT 
AATTTAATTT 
TTTTTCAAGC 
A 



31 
I 

CTGCCATCCC 
GCGGTAGGAG 
GGGGGCCCGG 
CACCGGGAGC 
ACAGAAGAAA 
CAGAOGCTCA 
TCGAAACTAT 
GTGCAGGCAA 
TTTTTGTAAC 
GCAGCTCAAC 
GAGAGCTGAT 
GATTGCTGAC 
CATTGCTTTG 
CTTGGAGTTC 
TASGACTTCT 
ACACAATGGA 
ACACATTAIT 
AGCAACTTCG 
GGTGGTCTTC 
TTATAGTGAT 
TATAACCTCA 
CCATGAAGAT 
CAGCTAGCTT 
ACAGAAATCA 
CTGTTATATY 



41 
I 

GGAGGGCCCA 
CTGGGCGCGC 
GGCGCTCCTT 

rm x xrm x 1 

AAGGAGAATC 
ATCTTCCQAA 
TCCATGATTG 
GCTAATCAAG 
AAGCTCTTCT 
ATGAACTCTG 
ACTTTTGACC 
AGAACGGATG 
GCCCTGTTAG 
ATCTATAACA 
GGCCAGATGT 
CAAGTGAGCT 
CTGGTACTGA 
AAAGGCGATG 
TTCTTCAGTT 
CTGGACTTTG 
GCTTTTTAAT 
AAACTGTTCC 
ATTCTTGTGT 
ATGGTAGCAT 
CAGTGTGTKC 



51 
I 

GCCAGCGGGC 
ACGGCTACCG 
CACGCCGTAG 
TTCTCCTGCT 
TTTTAGCTGA 
TGAATGGTGA 
TTATGTTCAC 
AATATCAAAT 
TCAGTATGGT 
CTCCTACATT 
TCCAAAGAAT 
TTCATATTCG 
TGTCGCTTGT 
AGACTGGTTG 
GGAACCATAT 
ACATTCATGG 
ATGCCGCTAT 
TTGGAAAAAG 
TTCTACTTTC 
AGTGAGAAGA 
TAAATGAAGC 
TGACTTTATA 
ACTTTTTTTA 
TTAGTAATCT 
CACAGGATTG 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 



SEQ ID NO:66 PDM1 Protein sequence:
Protein Accession #: NP_006756






1 11 21 31 41 51 

I I I I I I 

MGARGAPSRR RQAGRRLRYL PTGSFPFLLL LLLLCIQLGG GQKXKENLLA EKVEQLKEWS 60 
SEHSIFHHNG DKPRKFIKAP PB1IYSHIVMF TALQPQRQCS VCRQANEEVQ ILANSWRYSS 120 
ATCNKLFFSM VDYDEGTDVF QQUOflJSAPT PXHXPPKGRP KRADTFOLQR IGFAAEQLAK 180 
W1ADRTDVHI RVFRPPNYSG TIAIALLVSL VGGLLYXRFN KLBFITOKTG WAKVSLCIVF 240 
AHTSGQHWNH IRGPPYAHKM PHNGQVSYIH GSSQAQFVAB SHIILVLNAA ITKOtVLLNE 300 
AATSKGDVGK RRIICLVGLG LWPFFSPLL SIFRSKYHGY PYSDLDPE 



SEQ ID NO:67 PDM2 DNA SEQUENCE

Nucleic Acid Accession #: NM_000947

Coding sequence: 88-1617 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

GGTTTCATAT GAACTCTCCC GCCACCCGGG AACAGCTGGC TGCCACCGTT TGTGTTTTCC 60 

GAGTTTGTAT TCTTGCAGGT GACCAAGATG GAGTTTTCTG GAAGAAAGCG GAGGAAGCTG 120

AGGTTGGCAO CJTGACCAGAG GAATGCTTOC TACCCTCATT GCCTTCAGTT TTACTTGCAG 180 

CCACCTTCTG AAAACATATC TTTAACAGAA TTTGAAAACT TGGCTATTGA TAGAGTTAAA 240 

TTGTTAAAAT CAGTTGAAAA TCTTGGAGTG AGCTATGTGA AAGGAACTGA ACAATACCAG 300 

AGTAAGTTGG AGAGTGAGCT TCGGAAGCTC AAGTTTTCCT ACAGAGAGAA GCTAGAAGAT 360 

GAATATGAAC CACGAAGAAG AGATCATATT TCTCATTTTA TTTTGCGGCT TGCTTATTGC 420

CAGTCTGAAG AACTTAGACG CTGGTTCATT CAACAAGAAA TGGATCTOCT TCGATTTAGA 480 

TTTAGTATTT TACCCAAGGA TAAAATTCAG GATTTCTTAA AGGATAGOCA ATTGCAGTTT 540 

GAGGCTATAA GTGATGAAGA GAAGACTCTT CGAGAACAGG AGATTGTTGC CTCATCACCA 600 

AGTTTAAGTG GACTTAAGTT GGGGTTCGAG TCCATTTATA AGATCCCTTT TGCTGATGCT 660 

CTGGATTTGT TTCGAGGAAG GAAAGTCTAT TTGGAAGATG GCTTTGCTTA CGTACCACTT 720




AAGGACATTG 
TTAACAGCCA 
CACCTCAGTC 
TCTTTAGATC 
CATAAAGCCT 
TTTCTGAAGG 
ATCAAAGGAA 
AGCTTTGGAA 
CTGTCCAATC 
CAGCTGCTGA 
TTGGATTTAG 
CACAATGTGG 
CAACCTATTC 
CAACCCAAAC 
TCCTC TCTGG 
GTTTTATAAC 
TTGAAAAAGG 
AGCCTTGACC 
CACAGGTGTQ 
GTCTCCCTAT 
AGCOTCCCAO 
TMOCITTTC 
TTATTAGGAA 
AAGGAAAGAG 
TTTTAGGAGA 
ARCAACTTTT 
ATTTTTGTTA 



TGGCAATCAT 
GOTCCTTGCC 
ATTCCTACAC 
AGATTGATTT 
TGCGGGAAAA 
GCATTGGTTT 
AGATGGATCC 
AGGAAGGCAA 
CACCAAGCCA 
AGCAAAAGTT 
TAAAGGCGAC 
ATGATTGTGG 
TAAATGGTGO 
CAAGTGTCCA 
AAATGGATAT 
CCTTTTTCCT 
GTTTCACTGT 
TTCCCAGCTC 
CACCTCATAT 
GTTGCCCAGG 
AGTGCTGGGA 
GTTTAACTTC 
AGGAGGTTTG 
GAGGAGTTTC 
TAAAAACACC 
GTTTTAACTC 
ATAAATATCA 



CCT6AATGAA 
TGCTGTGCA6 
TGGCCAAGAT 
GCTTTCTACC 
TCACCATCTT 
AACTTTGGAA 
AGACAAGTTT 
GAGGACAGAC 
AGSGGATTAT 
GCAGTCATAC 
ACATTACCAG 
CTTTTCTTTG 
TAAAGACATA 
GAAAACCAAO 
GGAAGGACTA 
CAATAGCCTG 
CAOCAAGGCT 
AAGTGATOCT 
CCAGATAATT 
CAGATCTCAG 
TTACAGTTGT 
TCTCTTCACT 
AGGTAACAAC 
TATTAAAATC 
TTTGGGGACT 
TTAATCACTT 
AAGTGT 



SEQ ID NO:68 PDM2 Protein sequence
Protein Accession #: NP_000938



VSWKGTEQY 
IQQEMDLLRF 
ESIYKIPFAD 
QSDERLQPLL 
LEHGGEMQYG 
DYTPFSCLKI 
QVACQKYFEM 
KDASSALASL 



11 
I 

LRLAGDQRHA 
QSKLESELRK 
RFSILPKDKI 
ALDLFRGRKV 
UHLSHSYTGQ 
LPLKGIGLTL 
ILSNPPSQGD 
IHNVDDCGPS 



21 
I 

SYPHCLQFYL 
LKFSYREKLE 
QDFLKDSQLQ 
YLEDGFAYVP 
DYSTQGNVGK 
BQALQFWKQE 
YEGCPFRHSD 
LKiIPNQFFCE 
LEDYFSEDS 



TTTAGAGCCA 
TCTGATGAAA 
TACAGTACCC 
AAATCCTTCC 
CGTCATGGAG 
CAGGCATTGC 
GATAAAGGTT 
TATACACCTT 
CATGGGTGCC 
AAGATCTCTC 
GTAGCCTGTC 
AATCATCCTA 
AAGAAGCAAC 
6ATGCATCAT 
GAAGATTACT 

TAGTGCAGTG 
CCTACCTCAG 
TTTTTCAATT 
ACTCCTGGGC 
GAGCCACTGT 
GCATCCCAAT 
AGAGACTTTC 
TGTCACTTGA 
GGTTAAAOTC 
TGTAATTTTG 



31 
I 

QPPSENISLT 
DEYEPRRRDH 
FEAISDEEKT 
LKDIVAIILN 
ISLOgtOLLS 
FIKGKMDPDK 
PELLKQKLQS 
SQRILNGGKD 



AACTGTCCAA 
GACTTCAGCC 
AGGGARATGT 
CACCTTGCAT 
GCCGAATGCA 
ASTTCT6GAA 
ACTCTTACAA 
TCAOTTGCCT 
CAITCCGTCA 
CTGGAG6GAT 
AAAAATACTT 
ATCAGTTCTT 
CTATCCAACC 
CTOCTCTGGC 
TEAGTGAAGA 
TTAAGATTTT 
ACACAATTAC 
CCTCCCAAGT 

TCAAGCGATC 
CCCTGGCCTT 
CCATCTACAB 
ACTATATTTT 
GTGATGTCAT 
CCCCAGAAAC 
ACTCAATCCT 



41 
I 

EFENLAIDHV 
ISHFILRLAY 
LREQEIVASS 
EFRAKLSKAL 
TKSPPPCKRQ 
PDKGYSYNIR 
YKISPGGISQ 
IKKEPIQPET 



TCTGCTCAAT 
TGGGAAGATT 
GCGTCAGTTA 
GTATGGCCTA 
GCAA6AATFT 
CATCCGTCAC 
GAAGATTATT 
CAGTGATCCA 
AAGCCAGATT 
TGAGATGATA 
TT6TQAGAGC 
AQAAACTCCT 
CTCTTTAAAT 
TTCTTAGGCA 
OCCTTTGTTG 
AGCTGATTGC 
AGTTAGGACA 
OAGGTGGGGQ 
CTCACACCTC 
TTTTTTTTTT 
GCATGCACAC 
GCTTTGACAG 
TTRAGTCCTA 
TACAATAAAO 
TTTCTGGACC 



51 
I 

KLLKSVENU3 



780
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 



PSLSGLKLGF 
ALTARSLPAV 
LHKALRENHH 
HSFGKEGKRT 
ILDLVKGTHY 
PQPKPSV0KT 



1 
I 

AATTCATACA 
GTCTCGGCTC 
GTGTGGGAAG 
AGAGAAGCCC 
T6CACATCA0 
CTTCATTCAG 
TATATGCAAT 
TACTCACACT 
GACA'AXaTX'l'A 
GTGTGGAAAA 
AGAGAAACCC 
CAGACATCGG 
TTTCTCCCAC 
TGTAGGTTCA 
TGATCTCATA 
AGCTCAGACC 
CCAGGCTGTT 
ATATGAATGC 
AACACAGAGG 
TAATAAGCAT 
GAAGATAGAT 
GAAATATAAT 
GGTXTACACA 
CTAGTGGTAC 
GTAACTAGAA 
AAGGAGTATT 
AGGATGTGTA 
AAAAGGGGTT 
OCCTTTTTTG 
TTTCTTTGAT 



11 
I 

GGAGAGAACT 
ATTAATCATC 
GCCTKTCCA 
TAXGAATGCA 
AAAGCTCACA 
AAGGGAAATC 
GAATGTGGAA 
GGAGAGAAAC 
ATATCCCATC 
TCCTGCTCAC 
TATACATGCA 
AGAACTCATA 
TTGTCMGCC 
GTCAAATTGG 
CAGGATAAAG 
TCATTARCTA 
GCCAGAAGTT 
AGTGAATGTG 
AACAAACTGA 
ATACTCAGAG 
CTTCTCATCA 
GATCATGGAA 
GGAGAGAAAC 
ATTCTGCCTT 
CATCTTCATC 
TTAGAGATTT 
TTTTAGGACA 
GTCAGTGTTA 
ATAAGAGTCT 
TCCAAATTTC 



21 
I 

CATATATATG 
AGAGAGTTCA 
AAAGGTCCAG 
CTGAATGTGA 
CAGGAGAGAA 
TCATTGTRCA 
AAGGCTTCAT 
CCTATGAATG 
ASAGATTTCA 
ACAAGTCAGG 
GTGACTGTGG 
CAGGGGAGAG 
TTGTTTATCA 
AAAATCCTTG 
ACTCTGTTAA 
ACAGTGCGTT 
CAGTCTCAGC 
GTAGTGCTTT 
TATATTCAAG 
AAAAATAGTA 
GTGACCATAG 
AAGTCCTTGT 
TTTTGGAAGA 
ATCCTCAGAG 
AAAATATGAA 
CGATCAGAAA 
AXATACCTTG 
CACATCATTG 
TCTATTCCCA 
TTCACTTGTT 



31 
I 

CAGTGATTGT 
TACAGGAGAG 
GdCACTGAA 
CAAAGCATTC 
GTCATATATA 
TCAGCGAATT 
CCAAAAGGGC 
CAATGAATGT 
CACAGGAAAG 
TCTCATTAAC 
GAAAGCTTTC 
ACCGTATGGA 
TAAGGGAATG 
CTCAGAGAGT 
CATGGTGACT 
CCAAGCAGAG 
AGATAGTAGA 
CAGTGATCAA 
GTGGAAAGCC 
TGAAGTGGAG 
ATCACATCTT 
TCAGAAACAG 
CCTTTGAAGG 
GGAATCATAT 
AGAACACACG 
TCTAACATCA 
AATCACTAGT 
GTTAAATTTA 
ACCAAGATCA 
ATTTCAGACT 



41 
I 

GGAAAAGGCT 
AAACCACATG 
CACCAGAGAA 
OGCTCGAAAT 
TGCCGTGATT 
CATACTGGAG 
AACCTCCTTA 
GGGAAAGGCT 
ACACCCTTTG 
CACCAGAGAA 
AGAGATAAAT 
TGCTCTGATT 
CTGCATGCAA 
CATAGCTTAT 
CTGCAGATGC 
AGCAAAGTAG 
ATTTGCACAG 
TTACATCATA 
CTTGAATAAA 
ACTGGGAAAT 
CAGTGAGCTT 
TACGCCAGTA 
CTATGAATGT 
AGAAATAAAA 
AAGCAAAXAA 
TTATATGGCA 
TGATATGTCA 
TAGCACAATG 
TTATATGATT 
ACTGAAGCTC 



60 
120 
180 
240 
300 
360 
420 
480 



SEQ ID NO:69 PDM3 DNA SEQUENCE

Nucleic Acid Accession #: NM_024840

Coding sequence: 103-491 (underlined sequences correspond to start and stop codons)



51 
I 

TCATCAAGAA 
GATGCAGCCT 
CTCATACAGG 
CACAGCTCAA 
GTGGAAAAGG 
AAAAACCCTA 
TTCATCGACG 
TCAGCCAGAA 
TATGTACTGA 
TTCACACAGG 
CATGTCTCAA 
GTGGGAAAGC 
GAGAGAAATG 
CACATACACG 
CTTCTGTGGC 
CCATTGTGAG 
AMAAAAACC 
TGTCACAAAA 
ACCTTATGGC 
TCTTTTATGG 
ATAGTTGGEA 
GGTATCAGGG 
GGCAGGGTTG 
CTATGAAAAT 
GCCCTGTGAA 
GATAATATAC 
ATGACTAATT 
TACCTCTTCC 
AGCTCTTGTG 
TTCAAAAGGA 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 






AAAATGTATT TAATTTAATA ATGTAACACA ACRAOTTTGO ATGTQTTTAA CTTTATAAAT 1860
AATCACCCCA GAGGAATGAA GTTCAAAACT TGTGAATAAC C 

SEQ ID NO:70 PDM3 Protein sequence:
Protein Accession #: NP_079116

1 11 21 31 41 51 

I I I I I I

KDAACVGRPS PKGPGSLNTR ELIQERSPXN ALNVTKKSJU3 NHSSHHXRKL TQERSHHOW 60
IVBKASFRRE ISLYISEFIL BKNPXYAMNV EKASSKRATS LFIDVLTLER OTHNAMNVGK 120 
ASARRHV 

SEQ ID NO:71 PDM5 DNA SEQUENCE

Nucleic Acid Accession #: NM_018455

Coding sequence: 341-955 (underlined sequences correspond to start and stop codons)






1 11 21 31 41 51

I I I I I I 

AATTTCGGCA CGGGGGOGAX3 GCACAGTGAG TCCACTGGGG CACGGCAGCO TCTAAGCCAC 60 

AAGCCGACTG ACATAAGCCA GGTCCTAACG GAGCCTATGT GTAAGTCCAC TACT6GTGCA 120 

AGGTTGCACA CTTCTAAGAA GAGCGGCGTG GGGGGCTCGG CGACCTTCGC TTCAGTCGCT 180 

CCCCCGTGCA GTCCCCTGTG CCCAAGACAC AGCCTGATGC TTGTGCTCCG GTGGGCGGAC 240 

TTGGAGGCGG CGGGAACTGC AATTGGTGGC TTTGAAGG6C GGCGAGCGGG AACAGCTCTT 300 

GAGGAGTGAG ACTGCAOGAQ ATGTGGGCCG TGCCAAAGAG ATGGATGAGA CTGTTGCTGA 360 

GTTCATCAAG AGGACCATCT TGAAAATCOC CATGAATGAA CTGACAACAA TCCTGAAGGC 420 

CTGGGATTTT TTGTCTGAAA ATCAACTGCA GACTGTAAAT TT0C6ACAGA GAAAGGAATC 480 

TGTAGTTCAG CACTTGATCC ATCTGTGTGA GOAAAAGCGT GCAAGTATCA GTGATGCTGC 540 

CCTGTTAGAC ATCATTTATA TGCAATTTCA TCAGCACCAG AAAGTTTGGG ATGTTTTTCA 600 

GATGAGTAAA GGACCAGGTG AAGATQTTGA OCTTTTTGAT ATGAAACAAT TTAAAAATTC 660

GTTCAAGAAA ATTCTTCAGA GAGCATTAAA AAATGTGACA OTCAGCTTCA GAGAAACTGA 720 

GGAGAATGCA gtctggattc GAATTGCCTG gggaacacag TACACAAAGC CAAACCAGTA 780 

CAAACCTACC TACQTGGTGT ACTACTCCCA GACTCCGTAC GCCTTCACGT CCTCCTCCAT 840

GCTGAGGCGC AATACAOCGC TTCTGGGTCA GGAGTTAGAA GCTACTGGGA AAATCTACCT 900

CGGACAAGAG GAGATCATTT TAGATATTAC CGAAAT6AAG AAAGCTTGCA ATTAGTGAAC 960
ATGAAAGQAA AATAAAAATT CCTCACAGTC AAAAAAAAAA AAAAA 






SEQ ID NO:72 PDM5 Protein sequence:
Protein Accession #: NP_060925

1 11 21 31 41 51 

I I I I I I 

BDBTVAEPIK RTILKIPHNE LTTILKAWDF LSENQLQTVN FRQRKESWQ HLIHLCEEKR 60 
ASISDAALLD IIYMQFHQHQ KVWDVFQKSK GPGEDVDLFD KKQFKNSFKK ILQRALKNVT 120 
VSPRETEEHA VWIRXAHGTQ YTKPNQYXPT YWYYSQTPY AFTSSSMLRR NTFLLGQELB 180 
ATGKIYLRQE EIILDITEKK KACN 



SEQ ID NO:73 PDM9 DNA SEQUENCE

Nucleic Acid Accession #: NM_016192
Coding sequence: 1-1125 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

cc AieOTGCTCT GGGAGTCCGC GCGGCAGTGC AGCAGCTGGA CACTTTGCGA GGGCTTTTOC 60 

TGGCTGCTGC TGCTGCCCGT CATGCTACTC ATCGTAGCCC GCCCGGTGAA GCTCGCTGCT 120

TTCCCTACCT CCTTAAGTGA CTGGCAAACG CCCAOCGGCT GGAATTGCTC TGGTTATGAT 180 

GACAGAGAAA ATGATCTCTT CCTCTGTGAC ACCAACACCT GTAAATTTGA TGGGGAATGT 240 

TTAAGAATTG GAGACACTGT GACTTGCGTC TGTCAGTTCA AGTGCAACAA TGACTATGTG 300 

CCTGTGTGTG GCTCCAATGQ GGAGAGCTAC CAGAATGAGT GTTACCTGCG ACAGGCTGCA 360

TGCAAACAGC AGAGTGAGAT ACTTGTGGTG TCAGBAGGAT CATGTGCCAC AGAIGCAGGA 420

TCAGGATCTG GAGATGGAGT CCATGAAGGC TCTGGAGAAA CTAGTCAAAA GCAGACATOC 480 

ACCTGTGATA TTTGCCAGTT TGGTGCAGAA TGTGRCGAAG ATGCCOAGGA TGTCTGGTGT 540 

GTGTGTAATA TTGACTGTTC TCAAACCAAC TTCAATCCCC TCTGCGCTTC TGATGGGAAA 600 

TCTTATGATA ATGCATGCCA AATCAAAGAA GCATCGTGTC AGAAACAGGA GAAAATTGAA 660

GTCATGTCTT TGGGTCGATG TCAAGATAAC ACAACTACAA CTACTAAGTC TGAAGATGGG 720

CATTATGCAA GAACAGATTA TGCAGAGAAT GCTAACAAAT TAGAAGAAAG TGCCAGAGAA 780 

CACCACATAC CTTGTCCGGA ACATTACAAT GGCTTCTGCA TGCATGGGAA GTGTGAGCAT 840 

TCTATCAATA TGCAGGAGCC ATCTTGCAGG TGTGATGCTG GTTATACTGO ACAACACTGT 900 

GAAAAAAAGG ACTACAGTGT TCTATACGTT GTTCCCGGTC CTGTACGATT TCAGTATGTC 960 

TTAATCGCAG CTGTQATTGG AACAATTCAG ATTGCTGTCA TCTGTGTGGT GGTCCTCTGC 1020

ATCACAAGGA AATGCCCCAG AAGCAACAGA ATTCACAGAC AGAAGCAAAA TACAGGGCAC 1080 
TACAGTTCAG ACAATACARC AAGAGCGTCC ACGAGGTTAA TCTGA 






SEQ ID NO:74 PDM9 Protein sequence
Protein Accession #: NP_057276

1 11 21 31 41 51

I I I I I I

1 MVLWESPRQC SSVfTLCEGFC WLLLLPVMLL ZVARFVKLAA PPTSLSDCQT PTGHNCSGYD 60

61 DRENDLPLCD TNTCKPDGEC LRIGOTVTCV CQFKCNNDYV FVCGSNGESY QNECYLRjQAA 120 

121 CKQQSBILW SBGSCATDAG SGSGDGVHEG SGETSQKETS TCOICQFGAS CDEDAEDVWC 180 

181 VCHUCSQTN FNPLCASDGK SYDNACQIKB ASCQKQBKIB VMSLGRCQDN TTTTTKSEDG 240

241 HYARTDYAEN ANKLEESARE HHIPCPBHYH GPCHHGKCEH SIMKQEPSCR CDAGYTOQHC 300

301 EKKEYSVLYV VPGPVRFOW LIAAVIGTCQ IAVICVWLC XTRKCPRSNR IHRQKQNTGH 360 
361 YSSDNTTRAS TRLI 



SEQ ID NO:75 PDO1 DNA SEQUENCE

Nucleic Acid Accession #: NM_014324

Coding sequence: 89-1237 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I

GGCGCCGGGA TTOGGAGGGC TTCTTGCAGG CTGCTGGGCT GGG6CTAAGG OCTGCTCAGT 60

TTCCTICAGC GGSGCACTGG GAAQCGCCAT GGCACTGCAG GGCATCTCGG TCGTGGAGCT 120

GTCCGGCCTG GCOCCGGGCC GTOTCTGTGC TATGGTCCTG GCT6ACTTCG GGGCGCGTGT 180 

GGTACQCGTO GACCGGCCCG GCTCCCGCTA CGACGTGAGC CGCTTGGGCC GGGGCAAGCG 240 

CTCGCTAGTG CTGGACCTGA AGCAGCCGCG GGAGCCGCGT GCTGCGGCGT CTGTGCAAGC 300

CGTCGGATGT GCTGCTGGAG CCCTTCCGCC GCGGTGTCAT GGAGAAACTC CAGCTGGGCC 360

CAGAOATTCT GCRGCGGGAA AATCCAAGGC TTATTTATGC CAGGCTGAGT GGASTTQGCC 420 

AGTTCAGGAA AGCTTCTGCC GGTTAGCTGG CCACGATATC AACTATTTGG CTTTGTCAGG 480 

TGTTCTCTCA AAAATTGGCA GAAGTGGTGA GAATCCGTAT GCCCCGCTGA ATCTCGTGGC 540 

TGACTTTGCT GGTGGTGGCC TTATGTCTGC ACTGGGCATT ATAATGGCTC TTTTTGACCG 600 

CACACGCACT GACAAGGGTC AGGTCATTGA TGCAAATATG GTGGAAGGAA CAGCATWTT 660

AAGTTCTTTT CIGTGGAAAA CTCAGAAATC GAGTCTGTGG GAAGCACCTC GAGGACAGAA 720 

CATGTTGGAT GGTGGAGCAC CTTTCTATAC GACTTACAGG ACAGCAGATG GGGAATTCAT 780 

GGCTGTTGGA GCAATAGAAC CCCAGTTCTA CGAGCTGCTG ATCAAAGGAC TTGGACTAAA 840 

GTCTGATGAA CTTCCCAATC AGATGAGCAC GGATGATTGG CCAGAAATGA AGAAGAAGTT 900 

TGCAGATGTA TTTGCAAAGA AGACGAAGGC AGAGTGGTGT CAAATCTTTG ACGGCACAGA 960

TGCCTGTGTG ACTCCGGTTC TGACTTTTGA GGAGGTTGTT CATCATGATC ACAACA&GGA 1020 

ACGGGGCTGG TTTATCACCA GTGAGGAGCA GGACGTGAGC CCCCGCCTTC CACCTCTGCT 1080 

GTTAAACACC CCAGCCATCC CTTCTTCCAA AGGGGATOCT TTCATAGGAG AACACACTGA 1140 

GGAGATACTT GAAGAATTTG GATTCAGCCG AGAAGAGATT TATCAGCTTA ACTCM3ATAA 1200 

AATCATTGAA AGTAATAAGG TAAAAGCTAG TCTCTAACTT CCAGGCCCAC GGCTCAAGTG 1260

AATTTGAATA CTQCATTTAC AGTGTAGACT AACACATAAC ATTGTATGCA TGGAAACATG 1320 

GAGGAACAGT ATTACAGTGT CCTACCACTC TAATCAAGAA AAGAATTACA GACTCTGATT 1380 

CTACAGTGAT GATTGAATTC TAAAAATGGT TATCATTAGG GCTTTTGATT TATAAAACTT 1440 

TGGGTACTTA TACTAAATTA TGGTAGTTAT TCTGCCTTCC AGTTTGCTTG AIATATTTGT 1500

TGATATTAAG ATTCTTGACT TATATTTTGA ATGGGTTCTA GTGAAAAAGG AATGATATAT 1560

TCTTGAAGAC ATCGATATAC ATTTATTTAC ACTCTTGATT CTACAATGTA GAAAATGAGG 1620 

AAATGCCACA AATTGTATGG TCATAAAAGT CACGTGAAAC AGAGTGATTG GTTGCATCCA 1680 

GGCCTTTTOT CTTGGTGTTC ATGATCTCCC TCTAAGCACA TTCCAAACTT TAGCAACAGT 1740 

SATCACACTT TGTAATTTGC AAAOAAAAGT TTCACCTGTA TTGAATCAGA ATGCCTTCAA 1800

CTGAAAAAAA CATATCCAAA ATAATGAGGA AATGTGTTGG CTCACTACGT AGAGTCCAGA 1860

GGGACAGTCA GTTTTAGGGT TGCCTGTATC CAGTAACTCG GGGCCTGTTT CCCCGTGGGT 1920 

CTCTGGGCTG TCAGCTTTCC TTTCTCCATG TGTTTGATTT CTCCTCAGGC TGGTAGCAAG 1980 

TTCTGGATCT TAXACCCAAC ACACAGCAAC ATCCAGAAAT AAAGATCTCA GGACCCCCCA 2040 
AAAAAAAAAA AAAAAAAAAA AAAAAAAA 






SEQ ID NO:76 PDO1 Protein sequence:
Protein Accession #: NP_055139



1 11 21 31 41 51 

I I I I I I

1 HALQGISWE LSGLAPGBXC AHVLADFGAR WRVDRPGSR YDVSRLGRGK RSLVLDLKQP 60 
61 KBPRAAASVQ AVGCAAGALP FRCHGETPAG PRDSAAGKSK AYIXOAEWXW PVQESFCRLA 120 
121 GHDINYLALS GVLSKXGKSG ENPYAPLHLV ADPAGGGLMC ALGZIHALFD RTRTDKGQVI 180 
181 DANMVEGTAY LSSFLWKTQK SSLWEAPRJSQ NMLDGGAPFY TTYRTADGEF MAVGAIEPQF 240 
241 YELLIKGLGL KSDELPHQMS TDDWPEMKKK FADVFAKKTK ABWCQIPDGT DACVTPVLTF 300

301 EEWHHDHNK ERGSFITSES QDVSPRLAPL LLNTPAIPSS KGOPFXGEHT EEILEEFGPS 360 
361 REEIYQLHSD KIIESNKVKA SL 

SEQ ID NO:77 PDO3 DNA SEQUENCE

Nucleic Acid Accession #: AB028951

Coding sequence: 97-1128 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I

GTTAAATCCT TACTTTACCA GATTCTTGAT GGTATCCATT ACCTCCATGC AAATTGGGTG 60

CTTCACAGAG ACTTGAAACC AGCAAATATC CTAGTAATGG GASAAGGTCC TGAGAGGGGG 120 

AGAGTCAAAA TAGCTCACAT GGGTTTTGCC AGATTATTCA ATTCTCCTCT AAAGCCACTA 180 

GCAGATTTGG ATCCAGTAGT TGTGACATTT TGGTATCGGG CTCCAGAACT TTTGCTTGGT 240 

GCAAGGCATT ATACAAAGGC CATTGATATA TGGGCAATAG GTTGTATATT TGCTGAATTG 300 

TTGACTPCGG AACCTATTTT TCACTGTCGT CAGGAAGATA TAAAAACAAG CAATCCCTTT 360
CATCATGATC AACTGGATCQ GATATTTAGT GTCATGGGGT TTCCTGCAGA TAAAGACTGG 420
GAAGAIATTA GAAAGAXGCC AGAATATCCC ACACTTCAAA AAGACTTTAG AAGAACAACG 480 
TATGCCMCA GTAGCCTCAT AAAGTACATG GAGAAACACA AGGTCAAGCC TGACAGCAAA 540
GTGTTCCTCT TGCTTCAGAA ACTCCTGACC ATGGATCCAA CCAAGAGAAT TACCTCGGAO 600 
CAAGCTCTGC AGGATCCCTA TTTTCAGGAO GACCCTTTGC CAACATTAGA TGTATTTGCC 660 
GGCTGCCAQA TTCCATACCC CAAACGAQAA TTCCTTAATQ AAGATGATOC TQAAQAAAAA 720 
GGTGACAAGA ATCAGCAACA GCAGCAGAAC CAGCATCAGC AGCCCACAGC CCCTOCACAG 780 
CAGGCAGCAO CCOCTCCACA GGCGCOCCCA CCACAGCAGA ACAGCACCCA GACCAACGGG 840 
AOOGCAGGTG GGGCTGGCGC CGGGGTCGGG GGCACCGGAG CAGGGTTGCA GCACAGCCAG 900 
GACTCCAGCC TGAACCAGGT GCCTCCAAAC AAGAAGCCAC GGCTAGGGCC TTCAGGCGCA 960 
AACTCAGGTO GACCTGTGAT GOCCTCGGAT TATCAGCACT GCAGTTCT03 CCTGAATTAC 1020 
CAAAGCAGCG TTCAGGGATC CTCTCAGTCC CAGAGCACAC TTGGCTACTC TTCCTCGTCT 1080 
CAGCAGAGCT CACAGTACCA CCCATCTCAC CAGGCCCACC GGTACTGACC AGCTCCCGTT 1140 
GGGCCAGGCC AGCCCAGCCC AGAGCACAQG CTCCAGCAAT ATGTCTGCAT TGAAAAGAAC 1200 
CAAAAAAATG CAAACXATGA TGCCATTTAA AACTCATACA CATGGGAGGA AAACCTTATA 1260 
1ACTGAGCAS TGTGCAGGAC TGATAGCTCT TCTTTATTGA CTTAAAGAAG ATTCTTGTGA 1320 
ASTTTCCCCA GCACCCCTTC CCTGCATGTG TTCCATTCTG ACTTCTCTGA TAAAGCQTCT 1380 
GATCTAATCC CAGCACTTCT GTAACCTTCA GCATTTCTTT GAAGGATTTC CTGGTGCACC 1440 
TTTCTCAXGC TGTAGCAATC ACTATGGTTT ATCTTTTCAA AGCTCTTTTA ATAGGATTTT 1500 
AATGTTITAG AAACAGGATT CCAGTGGTGT ATAGTTTTAT ACSTCATGAA CTGATTXAGC 1560 
AACACAGGTA AAAATGCACC TTTTAAAGCA CTACGTTTTC ACASACAAXA ACTCTTCTGC 1620 
TCATQGAAGT CTTAAACAGA AACTGTTACT GTCCCAAAGT ACTTTACTAIF TACGTTCGTA 1680 
TTTATCTAGT TTCAGGGAAG GTCTAATAAA AAGACAAGCG GTGGGACAGA GGGAACCTAC 1740 
AAOCAAAAAC TGCCTAGATC TTTGCAGTTA TGTGCTTTAT GCCACGAASA ACTGAAGTAT 1800 
GTGCTAATTT TTAIAGAATC ATTCATATGG AACTGAGTTG CCAGCATCAT CTTATTCTGA 1860 
ATAGCADTCA GTAATTAAGA ATTACAATTT TAACCTTCAT GCAGCTAAGT CTACCTTAAA 1920 
AAGGGTTTtJA AGAGCTTTGT ACAGTCTCGA TGGCCCACAC CAAAACGCTG AAGAGAGTAA 1980 
CAACTGCACT AGGATTTCTG TAAGGAGTAA TTTTGATCAA AAGACGTGTT ACTTCCCTTT 2040 
GAAQGAAAAO TTTTTAGTGT GTATTGTACA TAAAGTCGGC TTCICTAAAG AACCATTGGT 2100 
TTCTTCACAT CTGGGTCTGC GTGAGTAACT TTCTIGCATA ATCAAGGTTA CTCAAGTAGA 2160 
AGCCTGAAAA TTAATCTGCT TTTAAAATAA AGAGCAOTGT TCTCCATTCG TATTTGTATT 2220 
AGATATAGAG TGACTATTTT TAAAGCATGT TAAAAATTTA GGTTTTATTC ATGTTTAAAG 2280 
TATGTATTAT GTATGCATAA TTTTGCTGTT GTTACTGAAA CTTAATTCTA TCAAGAATCT 2340 
TTTTCATTCC ACTGAAXGAX TTCTTTTGOC COTAGGAGAA AACTTAATAA TTGTGOCTAA 2400 
AAACTAIGGG CGGATAGTAT AAGACTATAC TAGACAAAGT GAATATTTGC ATTTCCAITA 2460 
1CTATGAATT AGTGGCTGAG TTCTTTCTTA GCTGCTTTAA GGAOCOOCTC ACTOCCCAGA 2520
GTCAAAAGGA AATGTAAAAA CTTAGAGCTC CCATTOTAAT GTAAGGGGCA AGAAATTTGT 2580 
GTTCTTCTGA ATGCTACTAG CAGCACCAGC CTTGTTTTAA ATGTTTTCTT GAGCTAGAAG 2640 
AAAZAGCTGA TTATTGTATA TGCAAATTAC ATGCATTTTT AAAAACTAT? CTTTCTGAAC 2700 
TTATCTACCT GGTTATGATA CTGTGGGTCC ATACACAAGT AAAATAAGAT TAGACAGAAG 2760 
CCAGTATACA TTTTGCACTA TTGATGTGAT ACTGTAGCCA GCCAGGACCT TACTGATCTC 2820 
AGCATAATAA TGCTCACTAA TAATGAAGTC TGCATAGTGA CACTCATCAA GACTGAAGAT 2880 
GAAGCAGGTT ACGTGCTCCA TTGGAAGGAG TTTCTGATAG TCTCCTGCIG TTTTACCCCT 2940 
TOCATTTTTT AAAATAA6AA ATTAGCAGCC CTCTGCATAA TGTAGCTQCC TATATGCAGT 3000 
TTTATCCTGT GCCCEAAAGC CTCACTGICC AGAGCTGTTG GTCATCAGAT GCTTATTGCA 3060
CCCTCACCAT GTGCCTGGTG CCCTGCTGGG TAGAGAACAC AGAGGACAGG GCATACTTCT 3120 
TGTCCTTAAG GAGCTTGTGA TCTGTGACAG TAAGCCCTCC TGGGATGTCT GTGCCATGTG 3180 
ATTGACTTAC AAGTGAAACT GTCTTATAAT AJGAAGGTCT TTTTGTTTAC TTCTAAACOC 3240 
ACTTGGGTAG TTACTATCOC CAAATCTGTT CTGTAAATAA TATTATGGAA GGGTTTCTAT 3300
GTCAGTCTAC CTTAGAGAAA GCCAGTGATT CAATATCACA AAAGGCATTG ACGTATCTTT 3360 
GAAATGTTCA CAGCAGCCTT TTAACAACAA CTGGGTGGTC CTTGTAGGCA GAACATACTC 3420 
TCCTAAGTGG TTGTAGGAAA TTGCAAGGAA AAXAGAAGGT CTGTTCTTGC TCTCAAGGAG 3480 
GTTACCTTTA ATAAAAGAAG ACAAACCCAG ATAGATATGT AAACCAAAAT ACTATGCCOC 3540 
TTAATACTTT ATAAGCAGCA TTGTTAAATA GTTCTTACGC TTATACATTC ACAGAACTAC 3600 
CCTGTTTTCC TTGTATATAA TGACTTTTGC TGGCAGAACT GAAATATAAA CTGTAAGGGG 3660 
ATTTCGTCAG TTGCTOOCAG TATACAATAT CCTCCAGGAC ATAGCCAGAA ATCTCCATTC 3720 
CACACATGAC TGAGTTCCTA TCCCTGCACT GGTACTGGCT CTTTTCTCCT CTTTCCTTGC 3780 
CTCAGGGTTC GTGCTACOCA CTGATTCCCT TTACCCTTAG TAATAATTTT GGATCATTTT 3840
CTTTCCTTTA AAGGGGAACA AAGCCTTTTT TTTTTTIGAG ACGGAGTGTT GCTCTGTCAC 3900
CCAAGCTGGA GTGCAGTGGC ACGATCTTGG CTCACTCCAA CCTCCACCTT CCAGGTTCAA 3960 
GTGATTCTCC TGCCTCAGCC TCCCGAGTAG CTGGGACTAC GGGCACGCAC CACCACGTCT 4020 
GGCTAATTTT TGTATTTTTA GTAGAGATGG GGTTTCACCC TATTGGTCAG GCTGGTCTTG 4080 
AATTCCTCAC CTCAGGTCAT CCGCCTGTCT CGGCCTCCCG AAGTGCTGGG ATTATAGGTG 4140 
TGAGOCACCG CACCCAGTTG GGAACAAAGC CTTTTTAACA CACGTAAGGG CCCTCAAACC 4200 
GTGGGACCTC TAAGGAGACC TTTGAAGCTT TTTGAGGGCA AACTTTACCT TTGTGGTCCC 4260 
CAAATGATGG CATTTCTCTT TGAAATTTAT TAGATACTGT TATGTCCCCC AAGGGTACAG 4320 
GAGGGGCATC CCTCAGCCTA TGGGAACACC CAAAGTAGGA GGGGTTATTG ACAGGAAGGA 4380 
ATGAATCCAA GTGAAGGCW TCTGCTCTTC GTGTTACAAA CCAGTTTCAG AGTTAGCTTT 4440 
CTGGGGAGGT GTGTGTTTGT GAAAGGAATT CAAGTGTTGC AGGACAGATG AGCTCAAGGT 4500 
AAGGTAGCTT TGGCAGCAGG GCTGATACTA TGAGGCTGAA ACAATCCTTG TGATGAAGTA 4560 
GATCATGCAG TGACATACAA AGACCAAGGA TTATGTATAT TTTTATATCT CTGTGGTTTT 4620 
GAAACTTTAG TACTTAGAAT TTTGGCCTTC TGCACTACTC TTTTGCTCTT ACGAACATAA 4680 
TGGACTCTTA AGAATGGAAA GGGATGACAT TTACCTATGT GTOCTGCCTC ATTCCTGGTO 4740 
AAGCAACTGC TACTTGTTCT CTATGCCTCT AAAATGATGC TGTTTTCTCT GCTAAAGGTA 4800 
AAAGAAAAGA AAAAAATAGT TGGAAAATAA GACATGCAAC TTGATGTGCT TTTGAGTAAA 4860 
TTEATGCAGC AGAAACTATA CAATGAAGGA AGAATTCTAT GGAAATTACA AATCCAAAAC 4920 
TCTATGATGA TCTCTTCCTA GGGAGTAGAG AAAGGCAGTG AAATGGCAGT TAGACCAACA 4980 
GAGGCTTGAA GGATTCAAGT ACAAGTAATA TTTTGTATAA AACATAGCAG TTTAGGTCCC 5040
CAIAATCCTC AAAAATAGTC ACAAATATAA CAAAGTTCAT TGTTTTAGGG TITTTAAAAA 5100 
ACGTCTTGTA OCTAAGGOCA TACTTACTCT TCTATGCTAT CACTGCAAAG GGGTGATATG 5160 



TATGTATTAT 
TTAATACTAT 
ATCTGOACTQ 
AGTATAfCCT 
ACCTGTTCTT 
GAGAIGACTG 
AGCGAGGCCT 
TTTTGAGTTG 
CATTTATTTT 



ATAAAAAAAA 
TTAATTTTTT 
AAGGTGTCCT 
TTCTAAACTG 

GTCTcrrrTT 

TAGCTTTTCG 
GCTOCAHGGA 
ACCTGACTTC 
ATATICTTGG 



AAACCCTTAA 
TAAAGATTTO 
TTTTAACAAC 
CCTAGTTTGT 
TCAGTCATTT 
TGCTCCACTG 
GTGCAGOACG 
CTTCTTGAAA 
TTGAAAIAAA 



TGCACTGTTA 
TCTGTGTAGA 
AATTTAAAGT 
ATATTCCTAT 
TCTCCACGCA 
CGAGOTTTGT 
AGCTACTGCT 
TGACTGTTAA 
ATTTAATTGA 



SEQ ID NO:78 PDO3 Protein sequence
Protein Accession #: BAA82980



11 



21 



TCTCCTAAAT 
CACTAAAAGT 
ACTITTTATA 
AATTCCTATT 
TCCCCCTTTA 
GCTCAGAGCC 
TTGGAGCGAS 
AACTAAAATA 
CTTTQ 



41 



ATTTAGTAAA 
ATTACACAAA 
TATGTTATGT 
TGTGAAGTGT 
TATGGTTATA 
GCTOCWXCC 
G0TTT0CT3C 
AATTACATTG 



51 



5220 
5280
5340 
5400 
5460 
5520 
5580 
5640 



31 

I 1 I I I I 

VKSIXYQIU) GIHYLHANWV LHHDLKPANI LVHGEGPKRG RVKIADM3FA RLFNSPLKPL 
ADLDPVWTF WYRAFEU.LG AHHYTKAIDI WAIGCIFAEL LTSEPIFHCR QEDIKTSNPF 
. BHDQLDRIPS VHGFPADKDW EDIRKMPEYP TLQKDPRRTT YANSSLIKYH EKHKVKPDSK 
VFLLLQKLLT HDPTKRITSS CALQDPYFQE DPLPTU3VFA GCQIPYFKRB FUJEDDPBBK 
GDKHQQQQQN QHQQPTAPPQ QAAAPPQAPP PQQNSTQTNO TAGGAGAGVG GTGAGLQHSQ 
DSSUJQVPFN KKPRLGPSGA HSGGPVMPSD YQHSSSRLNY QSSVQ6SSQS QSTLOYSSSS 
QQSSQYHPSH QAHRY 

SEQ ID NO:79 PDO5 DNA SEQUENCE

Nucleic Acid Accession #: XM_002922

Coding sequence: 1-2190 (underlined sequences correspond to start and stop codons)



60 
120 
180 
240 
300 
360 



ATGAATCCTT 
GAGGTACCAC 
AACTATCCAC 
TATGGAATGA 
ACCTCCACAT 
GCAGCCATTG 
TATGTGCTTG 
GTACACACAG 
AAACCCTGTG 
ACTAGATACT 
ATCACACCCA 
TTTGGAGTTC 
ATATACAATA 
TTTGCTATTT 
CTAGACTGGG 
AGGGTACTAT 
TCAOGATGGA 
CCGGACCAGA 
TTTGTCATTT 
GCTGTTGGTA 
ATAAATGAAA 
CTGGCAGATG 
GAGTCCATCA 
AGCCAGGATT 
GTGCAGGAGA 
ATGATGGTAA 
AACACTTTGG 
GAAGACTATG 
TGTAQAACAG 
TATCTGTTTG 
ATTCCAGCCA 
GGGGAGGTCA 
ATGAAATCTG 
CTTGTTGTGG 
CTCCTGCTGG 
ACA6AGGATA 
AAACTAGAGA 



11 
I 

TCCAGAAAAA 
CTCGACCAOC 
TGAGCATTGC 
AAGCTGTGCT 
CTATATAOCA 
CTGACTCGTG 
GCCATGTGAT 
TCCTATCATT 
TGGCAGCTTT 
TCTCAGTCTT 
TGCTGAGAGG 
CA6GACT6CT 
AACCACCCCC 
CCAATCGTTT 
CAGCTGAGAA 
TCCTTTATAT 
CTTTGCAAGC 
TGCAGGTTCT 
ATCGTCTGGT 
TGATCCTAGC 
TGGCCCCAGC 
ATGAG6TGAA 
AATCCTTTCA 
TTCACTTCCA 
AGAACTGGTA 
AGOATACAGA 
ATAAAGATGT 
GTGTGTCTGC 
AAGAXAAGAA 
TTATTACTAA 
ACAAAATGTC 
TGTTCTCTGT 
TGCTCCAGGC 
CACAGTTCAG 
TGATCTGCCT 
TGCGGGGTCC 
CCAAGAAGAC 



21 
I 

TGAGTCCAAG 
TACCCCTCCA 
CTTCATTGTG 
GATCCTGTAT 
TGCCTTCAGC 
GTTGGGAAAA 
CAAGTCCTTG 
GATCGGCCTQ 
TGGTGGAGAC 
CTACCTGTCC 
AGATGTCCAA 
CATGGTAATT 
TGAAGGAAAC 
CAAOAACCGT 
ATATCCAAAG 
CCCATTGCCC 
CATCAGGATQ 
AAATCCCTTT 
CTCCAAGTGT 
GTGCC7GGCA 
CCAGTCAGGT 
GGTGACAGTG 
GAAAACAOCA 
CCTGAAATAT 
CAGTCTTGTC 
AAGCAAAACA 
CAACATCTCC 
TTATAOAACT 
CTTTTCTCTG 
TAACAOCAAT 
CATTGCGTGG 
CACAGGTCTT 
ACCTTGGCTA 
TGGCCTGGTA 
GATCTTCTCC 
AGCAGATAAG 
AAAACTCTGA 



31 

I 

GAAACTCTTT 
AASAASCCAT 
GTGAATGAAT 
TTCCTGTATT 
AGCCTCTGTT 
TTCAAGACAA 
GGTGCCTTAC 
AGTCTAATAG 
CAGTTTGAAG 
ATCAATGCAG 
TGTTTTGGAG 
GCACTTGTTG 
ATAGTGGCTC 
TCTGGAGACA 
CAGCTCATTA 
ATGTTCTGGG 
AATAGGAATT 
CTGGTTCTTA 
GGAATTAACT 
TTTGCAGTTG 
CCCCAGGAGG 
GTGGGAAATG 
CACTATTCCA 
CACAATTTGT 
ATTCGTGAAQ 
ACCAATGGGA 
CTGAGTACAG 
GTGCAAAGAG 
AATTTGGGTC 
CAGGGTCTTC 
CAGCTACCAC 
GAGTTTTCTT 
TTGACAATTG 
CAGTGGGCCG 
ATCATGGGCT 
CACATTCCTC 



41. 
I 

TTTCACCTGT 
CTCOGACAAT 
TCTGCGAGCG 
TCCTGCACTG 
ATTTTACTCC 
TCATCTATCT 
CAATACTGGG 
CTTTGGGGAC 
AAAAACATGC 
GGAGCTTGAT 
AACACTGCTA 
TGTTTGCAAT 
AAGTTTTCAA 
TTCCAAAGCG 
TGGATGTAAA 
CTCTTTTGGA 
TGGGGTTTTT 
TCTTCATCCC 
TCTCATCACT 
OGGCAGCTOT 
TVTTCCTACA 
AAAACAATTC 
AACTGCACCT 
CTCTCTACAC 
ATGGGAACAG 
TGACAACCGT 
ATACCTCTCT 
GAGAATACCC 
TTCTAGACTT 
AGGCCTGGAA 
AATATGCCCT 
ATTCTCAGGC 
CAGTTGGGAA 
AATTCATTTT 
ACTACTATGT 
ACATCCAGGG 




CTTTTCCTAT 
GAATGAAGAT 
CATCCTGGGA 
CTCCTTGGTG 
AGGACAAGTG 
AGGAGGCATC 
AGAGGAACGG 
TTCTACATTT 
TGCATTGGCT 
GGGAAGCAAA 
ATGTATCTGG 
ACAGCACTGG 
GGCACTGACC 
TCAGCAQGGT 
TGTGCTTCAG 
GTTGTTTGAC 
TAGGAAAATG 
AGAGATAAAA 
AGTCTTGAAT 
TCTGTTGAIA 
GAAAACAAAA 
TGAGCATTCT 
TATCTCCAGC 
GAGGTTTGTT 
CAATGTTGGT 
TGCAGTGCAC 
TGGTGCAGCA 
GATTGAAGAC 
GGTTACAGCT 
TCCCTCTAGC 
TATCATCGTG 
GTTTTCCTGC 
TCCTGTAAAG 
GAACATGATC 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
I960 
2040 
2100 
2160 



SEQ ID NO:80 PD05 Protein sequence
Protein Accession #: XP_002922



HNPFQKNESK 
YGKKAVLILY 
YVLUHVIKSL 
TRYFSVPYLS 
IYNKPPPEGN 
RVWLYIPLP 
FVIYRLVSKC 
LADDEVKVTV 



11 
I 

ETLFSEVSIE 
FLYFLHWNED 
GALPILCGQV 
INAGSLISTF 
IVAQVFKCIW 
HFWALLDQQG 
GINFSSLKKM 
VGNENNSliLI 



21 
I 

EVPPRPPSPP 
TSTSIYHAPS 
VBTVLSLIGL 
ITFMLHGDVQ 
FAISNRFKNR 
SRWTLQAIHM 
AVGMILACLA 
ESIKSFQKTP 



31 41 51 

I I I 

KKPSPTICGS NYPLSIAFIV VNEFCERFSY 
SLCYFTPILG AAIADSWLGK FKTIIYLSLV 
SLIALGTGGI KPCVAAPGGD QPEEKHAEER 
CFGEDCYALA PGVPGLLH71 ALWPAHGSK 
SGDIFKRQHW LDWAAEKYPK QLIMUVKALT 
HHNtGPFVLQ PDQHQVUIPF LVLZFIPLFD 
FAVAAAVEEK IHEMAPAQSG PQEVPLOVIiJ 
HYSKLHLKTK SQDFHFHLKY HKLSLYTEHS 
VQEKHWYSLV IRE0GN5ISS HMVKDTBSKT THGMTTVKFV NTI2KDVHIS LSTOTSLHVG 540

EDYGVSAYKT VQHGBYPAVH CRTEDKNFSL KLGLLDFGAA YLFVTIKNTN QGLQAKKIED 600 

IPANKHSIAVf QLPQYALVTA GBVBFSVTGL EFSYSQAPSS MKSVLQAAWL LTIAVGHIIV 660 

LWAQFSGLV QKAEPILFSC LLLVICLIFS IMGYYWPVK TEDHRGPADK HIPHIQGNHI 720 
KLETKKTKL 

SEQ ID NO:81 POOS DNA SEQUENCE

Nucleic Acid Accession #: NMJK0448

Coding sequence: 1-1221 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51 

I I I I I I 

ATGGACGGAT CCCACAGCGC AGCCCTGAAG CTGCAGCAGC TGCCTCOCAC AAGTAGCTCC 60 

AGCGOCGZAA GCGAGGCCTC CTTCTCCTAC AAGGAAAACC TGATTGCCGC OCTCTTGGCO 120 

ATCTTCGGGC ACCTCGTGGT CAGCATTCCA CTTAACCTCC AGAAGTACTG CCACATCCGC 180 

CTGGCAGGCT CCAAGQATCC CCGGCCCTAT TTCAAGACCA AGACATOOTO GCTGGGCCTG 240 

TTCCTGATOC TTCTGGSCGA GCTGGGTGTG WOGOCTCCT AOSOCITOSC GCCGCTCTCA 300 

CTCATCGTOC CCCTCAGCGC AGTTTCTGTG ATAGCTAGTG CCATCATAGG AATCATATTC 360

ATCAAGGAAA AGTGGAAACC GAAAGACTTT CTGAGGCGCT AOGTCTTQTC CTTTGTTGGC 420

TGCGGTTTGG CTGTCGTGGG TACCTACCTG CTGGTGACAT TCGCACCCAA CAGTCACGAG 480 

AAGATGACAG GCGAGAATGT CACCAGGCAC CTCGTGAGCT GGCCTOTCCT CTTCTACATG 540 

CTGGTGGAGA TCATTCTCTT CTGCTT6CTG CTCTACTTCT ACAAGGAGAA GAACGCCAAC 600 

AACATTGTCG TGATTCTTCT CTTGGTGGCG TTACTTGGCT OCASGACAGT GGTGACAGTC 660 

AAGGCCGTGG CTGGGATGCT TCTCTTQTCC ATTCAAGGGA ACCTGCAGCT TGACIACOCC 720 

ATCTTCTAOG TGATGTTCGT GTGCATGGTG GCAACC6CCG TCTA3CAGGC TGCGTTTTTO 780 

AGTCAAGOCT CACAJ3ATGTA OGACTCCTCT TTGATTGCCA GTQTGGGCTA CATTCTGTCC 840 

ACAACCATTG CTATCACAGC AGGTGCAATA TTTTftCCTGO ACTTCATCGG GSAGGACGTG 900 

CTGCACATCT GCATGTTTGC ACTGGGGTGC CTCATTGCAT TCTTGGGCGT CTTCTTAATC 960

ACGCGTAACA GGAAGAAGCC CATTCCATTT GAGCCCTATA TTTCCATGGA TOCCATGCCA 1020 

GGTATGCAGA ACATGCAC6A TAAAGGSATG ACTGTOCAGC CTGAACTTAA AGCTTCTTTT 1080 

TCCTATGGGG CTCTGGAAAA CAATGACAAC ATTTCTGAGA TCTACGCTCC TSCCACOCTG 1140 

CCAGTCATGC AAGAAGAGCA CGGCTCCAGA AGTGCCTCTG GGGTCCCCTA CCGAGTOCTA 1200 
GAGCACACCA AGAAGGAATO A

SEQ ID NO:82 POOS Protein sequence
Protein Accession #: NP_065181

1 11 21 31 41 51

I I I I I I

HDGSHSAALK WQLPPTSSS SAVSEASFSY KENLIGALLA IFGHLWSIA LtJLQKYCHIR 60

LAGSKDPRAY FKTKTWWLGL FLMLLGELGV FASYAFAPLS LIVPLSAVSV IASAIIGIIF 120 

IKEKWKPKDP LRRYVLSFVG CGLAWGTYL LVTFAPNSHE KMTGENVTRH LVSWPFLLYM 180 

LVEIILFCLL LWYKESCHAN NIWILLLVA LLGSMTWTV KAVAGMLVLS IQGNLQLDYP 240 

IFYVHFVCHV ATAVYQAAFL SQASQMYDSS LIASVGYTLS TTIAITAGAI FYLDPIGEDV 300 

LH1CMPALGC LIAFLGVFLI FRURKKPIPF EPYISHDAKP GHQNHHDKGM TVQPBLKASF 360
SYGALENNDN XSBIYAPAXL FVMQEEBGSH SASGVPYRVL EHTKKE 
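
The stated coding-sequence coordinates can be cross-checked against the length of the corresponding protein listing. The sketch below is illustrative only and is not part of the original disclosure: the function name `expected_protein_length` is hypothetical, and it assumes the coordinates are 1-based, inclusive, and include the stop codon. Under those assumptions, a coding sequence of 1-1221 (SEQ ID NO:81) implies 406 residues, which agrees with the 406 residues listed for SEQ ID NO:82.

```python
def expected_protein_length(cds_start: int, cds_end: int) -> int:
    """Number of amino acids implied by a coding-sequence coordinate range.

    Assumes 1-based inclusive coordinates that include the stop codon
    (an assumption made for this sketch, not stated by the listing).
    """
    n_bases = cds_end - cds_start + 1
    if n_bases <= 0 or n_bases % 3 != 0:
        raise ValueError("coding range is not a positive multiple of three bases")
    # One codon per residue, minus the terminal stop codon.
    return n_bases // 3 - 1


if __name__ == "__main__":
    # Coordinates as printed for SEQ ID NO:81.
    print(expected_protein_length(1, 1221))
```

A mismatch between this value and the residue count of the protein listing would suggest a transcription error in either the coordinates or the sequence.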

SEQ ID NO:83 POOS DNA SEQUENCE

Nucleic Acid Accession #: NM_032712
Coding sequence: 555-908 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

CACTCATTAA GAACAGAGGA GGCTGCCTGT TACTCCTGGT GTTGCATOCC TCCAGACACT 60 

CTGCTGTTTC CTGCCTAGGC GTGGCTGCAG CCATGGCTAQ GAAAGCGCTG CCACCCACCC 120 

AOCTGGGCCA GAGCTGGTTC TGCTCCTGCT GCAGGGACAC TGAGCTGGCT ATCTCGGCGC 180 

TTCGGGCAAG AACTGCAACA GGCTCTCCTG GGTCCTGCAG GTGTACAGCC GGGCCCCTGC 240 

CTTGTGCCTC AGCTCTCGAG AGCTGCTGCT GOCGGGTGAC CTGAICCAAC CTGATAAGGT 300

GCCATCTTCA GCTACCACTG CAAGGCCCTG AGGGCAACAG CAGCACQGCA CTGCCCACCC 360 

GGCTGCTGAT GGCCTGGTGC CAGCTGGGAG TCCTCCCGGC ACTTOGAGGC CACTGAGCCA 420 

CCCTTCCAGC CCCAGCCCAC CATGGACAGG GGTATCCAGC TTCCTCCTCA ACCTCGTCCT 480 

CTGCGCCTGA GCCAGTGACG CCCAAGGACA TGCCTGTTAC CCAGGTCCTG TACCAGCACT 540 

AGCTGGTCAA GGGCATGACA GTGCTGGAGG CCGTCTTGGA GATCCAGGCC ATCACTGGCA 600 

GCAGGCTGCT CTCCATGGTG CCAGGGCCCG CCAGGCCACC AGGCTCATGC TGGGACCCAA 660 

CCCAGTGCAC AAGGACTTGG CTGCTGAGCC ACACACCCAG GAGAAGOTGG ATAAGTGGGC 720 

TACCAAGGGC TTCCTGCAGG CTAGGGGAGG AGCCACCCCC GCTTCCCTAT TGTGACCAGG 780 

CCTATGGGGA GGAGCTGTCC ATACGCCACC GTGAGAOCTG GGCCTGGCTC TCAAGGACAG 840 

ACACCGCCTG GCCTGGTGCT CCAGGGGTGA AGCAGGCCAG AATCCTGGOG GAGCTGCTCC 900 

TGGTTTGAGC TGCATTCAGG AAGTGCGGGA CATGGTAGGG GAGGCAAAAA GCCTTGGGCA 960 

CTACCCTCCC TGTGGAGCTG TTCGGTGTCC GTCGAGCTAG CCACACCCTG ACACCATGTT 1020 

CAAGGGTACC GGAAGAGAAG GGTGTCTGCC CCCAACCTCC CCTGTGGGTG TCACTGGCCA 1080 

GATGTCATGA GGGAACCAGG CCTTGTGAGT GGACACTGAC CATGAGTCCC TGGGGGGAGT 1140 

GATCCCCCAG GCATCGTGTG CCATCTTGCA CTTCTGCCCA GGCAGCAGGO TGGGTGGGTA 1200 

CCATGGGTGC CCACCCCTCC ACCACATGGG GCCCCAAAGC ACTGCAGGCC AAGCAGGGCA 1260
ACOCCACACC CTTGACATAA AACCATCTTG AAGCTTTTAA AAAAAAAAAA AAAAAA

SEQ ID NO:84 PDOa Protein sequence
Protein Accession #: NP_116101

1 11 21 31 41 51
I I I I I I

MTVLEAVLEI QAXTGSRLLS MVPGPARPPG SCMDPTQCTR TWLLSHTPRR EWISGLPEAS 60 
CRLGEEPPPL PYCDQAiGEE LSIRHRE1UA WLSRTDTAWP GAPGVKQARI L6BLU1V 

SEQ ID NO:85 PDT1 DNA SEQUENCE

Nucleic Acid Accession #: NM_000693

Coding sequence: 53-1591 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

AGCCGGTGCG CCGCAGACTA G66CGCCT0G GGCCAGGQAG CGCGGAGGAG CCATGGCCAC 60 

CGCTAACGGG GCCGTGGAAA ACGGGCAGCC GGACGGGAAG CCGCCSGCCC TGCCGCGCCC 120 

CATCCGCAAC CTGGAGGTCA AfflTCACCAA GATATTTATC AACAATGAAT GGCACGAAXC 180 

CAAGAGTGGG AAAAAOTTTQ CTACAIGTAA CCCTTCAACT CGGGAGCAAA TATGTQAAGT 240 

GGAAGAAGGA GATAAGCOCG ACGTGGACAA GGCTGTGGAS GCTGCACAGG TTGCCTTCCA 300 

GAJ3GGGCTCG CCATGGCGCC 6GCTGGATGC CCTGAGTCGT GGGCGOCTGC TGCACCAGCT 360 

GGCTGACCTG GTGGAGAGGG ACCGCGCCAC CTTGGCCGCC CTGGAGACGA TGGATACAGG 420 

GAAGCCATTT CTTCATGCTT TTTTCATCGA CCTGGAGGGC TGTATTAGAA CCCTCAGATA 480 

CTTTGCAGGG TGGGCAGACA AAATCCAGGG CAAGACCATC CCCACAGATG ACAACGTCGT 540 

ATGCTTCACC AGGCATGAGC CCATTGGTGT CTGTGGGGCC ATCACTCCAT GGAACTTCCC 600 

CCTGCTGATG CTGGTGTGGA AGCTGGCACC CGCCCTCTGC TGTGGGAACA CCATGGTCCT 660 

GAASCCTGCG GAGCAGACAC CTCTCACCGC CCTTTATCTC GGCTCTCTGA TCAAAGAGGC 720 

CGGGTTCCCT CCAGGAGTGG TGAACATTGT GCCAGGATTC GGGCCCACAG TGGGAGCAGC 780 

AATTTCTTCT CACCCTCAGA TCAACAAGAT CGCCTTCACC GGCTCCACAG AGGTTGGAAA 840 

ACTGGTTAAA GAAGCTGCGT GCOGGAGCAA TCTGAAGCG6 GT6ACGCTGG AGCTGGGGGG 900 

GAAGAACCCC TGCATCGTGT OTGCGGACGC TGACTCGGAC TTGGCAGTGG AGTGTGCCCA 960 

TCAGGQAGTQ TTCTTCAACC AAQGCCAGTG TTGCACGGCA GCCTCCAOGG TGTTCGTGGA 1020 

GGAGCAGGTC TACTCTGAGT TTGTCAGGCG GAGCGTGGAG TATGCCAAGA AACGGCCCGT 1080 

GGGAGACCCC TTCGATGTCA AAACAGAACA GGGGCCTCAG ATTGATCAAA AGCAGTTCGA 1140 

CAAAATCTTA GAGCTGATCG AGAGTGGGAA GAAGGAAGGG GCCAAGCTGG AATGCGGGG6 1200 

CTCAGCCATG GAAGACAAGO GGCTCTTCAT CAAACCCACT GTCTTCTCAG AAGTCACAGA 1260 

CAACATGCGG ATTGCCAAAG AGGAGATTTT CGGGCCAGTG CAACCAATAC TGAAGTTCAA 1320 

AAGTATCGAA GAAGTGATAA AAAGAGCGAA TAGCACCGAC TATGGACTCA CAGCAGCCGT 1380 

GTTCACAAAA AAICTOQACA AAGCCCTGAA GTTGGCTTCT GCCTTAGAGT CTGGAACGGT 1440 

CTGGATCAAC TGCTACAACG CCCTCTATGC ACAGGCTCCA TTTGGTGGCT TTAAAATGTC 1500

AGGAAATGGC AGAGAACTAG GTGAATACGC TTTGGOCGAA TACACAGAAG TOAAAACTOT 1560

CACCATCAAA CTTGGCGACA AGAACCCCTQ AAGGAAAGGC GGGGCTCCTT CCTCAAACAT 1620

CCCACGGCGG AATGTGGCAG ATGAAATGTG CT6GAGGAAA AAAATGACAT TTCTGACCTT 1680 

CCCGGGACAC ATTCTTCTGG AGGCTTTACA TCTACTCGAG TTGAATGATT GCTGTTTTCC 1740 

TCTCACTCTC CTGTTTATTC ACCA6ACTGG GGAXGCCTAT AGGTTGTCTG TGAAATCGCA 1800 

GTCCTGCCTG GGGAGGGAGC TGTTGGCCAT TTCTGTGTTT CCCTTTAAAC CAGATCCTGG 1860

AGACAGTGAG ATACTCAGGG CGTTSTTAAC AGGGAQTGGT ATTTGAAGTG TCCAGCAGTT 1920 

GCTTGAAATG CTTTGCCGAA TCTGACTCCA GTAAGAATGT GGGAAAACCC CCTGTGTGTT 1980 

CTGCAAGCAQ GGCTCTTGCA CCAGCGGTCT CCTCAGGGTG GACCTGCTTA CAGAGCAAGC 2040 

CACGCCTCTT TOCGAGGTGA AGGTGGGACC ATTCCTTGGG AAAGGATTCA CAGTAAGGTT 2100 

TTTTGGTTTT TGTTTTTTGT TTTCTTGTTT TTAAAAAAAG GATTTCACAG TGAGAAAGTT 2160

TTGGTTAGTG CATACCGTGG AAGGGCGCCA GGGTCTTTGT GGATTGCATG TTQACATTGA 2220 

CCGTGAGATT CGGCTTCAAA CCAAXACTGC CTTTGGAATA TGACAGAATC AATAGCCCAG 2280 

AGAGCTTAGT CAAAGACGAT ATCACGGTCT ACCTTAACCA AGGCACTTTC TTAAGCAGAA 2340 

AATATTGTTG AGGTTACCTT TGCTGCTAAA GATCCAATCT TCTAACGCCA CAACAGCATA 2400 

GCAAATCCTA GGATAATTCA CCTCCTCATT TGACAAATCA GAGCTGTAAT TCACTTTAAC 2460 

AAATTACGCA TTTCTATCAC GTTCACIAAC AGCTTATGAT AAGTCTGTGT AGTCTTCCTT 2520 

TTCTCCAGTT CTGTTACCCA ATTTAGATTA GTAAAGGGTA CACAACTGGA AAGACTGCTG 2580 

TAATAACACA GCCTTGTTAT TTTTAAGTCC TATTTTGATA TTAATTTCTG ATTAGTTAGT 2640 

AAATAACACC TGGATTCTAT GGAGGACCTC GGTCTTCATC CAAGTGGCCT GAGTATTTCA 2700 

CTGGCAGGTT GTGAATTTTT CTTTTCCTCT TTGGGAATCC AAATGATGAT GTGCAATTTC 2760

ATGTTTTAAC TTGGGAAACT GAAAGTGTTC CCAIATAGCT TCAAAAACAA AAACAAATGT 2820 

GTTATCCGAC GGATACTTTT ATGGTTACTA ACTAGTACTT TCCTAATTGG GAAAGTAGTO 2880 

CTTAAGTTTG CAAATTAAGT TGGGGAGGGC AATAATAAAA TGAGGGCCCG TAACAGAACC 2940 

AGTGTGTGTA TAACGAAAAC CATGTATAAA ATGGGCCTAT CACCCTTGTC AGAGATATAA 3000

ATTACCACAT TTGGCTTCCC TTCATCAGCT AACACTTATC ACTTATACTA CCAATAACTT 3060 

GTTAAATCAG GATTTGGCTT CATACACTGA ATTTTCAGTA TTTTATCTCA AGTAGATATA 3120 

GACACTAACC TTGAIAOTGA TACGTTAGAG GGTTCCTATT CTTCCATTGT ACGATAATGT 3180 

CTTTAATATG AAATGCTACA TTATTTATAA TTGGTAGAGT TATTGTATCT TTTTATAGTT 3240 

GTAAGTACAC AGAGGTGGTA TATTTAAACT TCTGTAATAT ACTGTATTTA GAAATGGAAA 3300 

TATATATAGT GTTAGGTTTC ACTTCTTTTA AGGTTTACCC CTGTGGTGTG GTTTAAAAAT 3360 

CTATAGGCCT GGGAATTCOG ATCCTAGCTG CAGATCGCAT CCCACAATGC GAGAATGATA 3420 
AAATAAAATT GGATATTTGA OA 

SEQ ID NO:86 PDT1 PROTEIN SEQUENCE

Protein Accession #: NP_000684

1 11 21 31 41 51

I I I I I I

KATANGAVEN GQPDGKPPAL PRPIRNLEVK FTKIPINNEW HBSKSGKKFA TOJPSTREQI 60

CEVEEGDKFD VDKAVEAAQV AFQRGSPKRR LDALSRGRLL HQLADLVERD RATLAALETH 120 

DTGKFFLHAP FIDLEGCIRT LRYFAGWADK IQGKTXPTDD NWCFTRHEP IGVCGAITPW 180 

NFPLLHLVWK LAPALCCGNT KVLKPAEQTP I/TALYLGSLI KBAGPPPGW KIVPGFGPTV 240 

GAAISSHPQI NKIAFTGSTE VGKLVKEAAS RSNLKRVTLE LGGKNPCIVC ADADLOIAVB 300 
CAHQGVFFKQ GQCCTAASKV FVEEQVYSEP VKRSVBYAKK RPVGDPPDVK TEQGPQIDQK 360

QPDKILBLIE SGKKEQAXLE CGGSAMEDKO LPIKPTVFSB VTCNHRIAKB BIFGFVQPIL 420 

KFKSIBBVIK RANSTDYGLT AAVFTKNLDK ALKLASALBS GTVWXNCYNA LYAQAPFGGF 480 
KHSOTOREI/3 EYALAEYTEV KTVTIKLGDK HP 

SEQ ID NO:87 PDV3 DNA SEQUENCE

Nucleic Acid Accession #: NM_032642

Coding sequence: 184-1263 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

GACCATTAGC AGGCACCCAG GCCTGTCTTT OOCTCGGAAA CGGTGGCCCC CAATGTAGCC 60 

TAGTTTGAAC CTAGGAACTG CAGGACCAGA GAGATTCCAC TGGAGCCTGA TGGACGGGTG 120

ACAGAGGGAA CCCTACTCTG GAAACTGTCA GTCCCAGQGC ACTGGGGAjGG GCTGAGGCCG 180 

ACCATGCCCA GCCTGCTGCT GCTQTTCACG GCTGCTCTGC TGTCCAGCTG GGCTCAGCTT 240 

CTGACBGAOG CCAACTOCTG GTGGTCATTA GCTTTGAACC GGGTGCAGAG ACCCGAGATG 300 

TTTATCATCG GTGCCCAGCC CGTGTGCAQT CAGCTTCCCQ GQCTCTCCCC TOGCCAGAGG 360 

AAGCTGTGCC AATTGTACCA GGAGCACATG GCCTACATAG GGGAGGGAGC CAAGACTGGC 420 

ASCAAGGAAT GCCACCACCA GTTCCCCCAS CGGCGGTGGA ATTGCAGCAC AGOGGACAAC 480 

GCATCTGTCT TTGGGAGA6T CATGCAGATA G0CAGCCGA3 AOACCGCCTT CACCCACGCG 540

GTGAGOGCCG CGGGCGTGGT CAACGCCATC AGCGGGGCCT GGCGCGAGGG CGAGCTCTCC 600 

ACCTCCGGCT GCAGCCGGAC GGCGCGGCCC AAGGACCTGC OCCGGGACTG GCTGTGGGGC 660 

GGCTGTGGGQ ACAACGTGGA GTACGGCTAC CGCTTCGCCA AGGAGTTTGT GGATGCCCGG 720 

GAGCGAGAGA AGAACTTTGC CAAAGGATCA GAGGAGCAGG GCCGGGTGCT CATGAACCTG 780 

CAAAACAACG AGGCCGGTCG CAGGGCTGTG TATAAGATGG CAGACGTAGC CTGCAAATGC 840 

CACGGCGTCT CGGGGTCCTG CAGCCTCAAG ACCTGCTGGC TGCAGCTGGC CGAGTTCCGC 900

AAGQTCGGGG ACOGGCTGAA GGAGAAGTAC GACAGCGOGG CCGCCATGCG CGTCACCCGC 960 

AAGGGCCGGC TGGAGCTGGT CAACASCOGC TTCACCCAGC CCACCCCGGA GGACCTGGTC 1020 

TATGTGGACC CCAGCCCCGA CTACTGCCTG CGCAACGAGA GCACGGGCTC CCTGGGCACG 1080 

CAGGGCCGOC TCTGCAACAA GACCTCGGAO GGCATGGATG GCTGTGAGCT CATGTGCTGC 1140 

GGGCGTGGCT ACAACCACTT CAAGAGCGTG CAGGTGGAGC GCTGCCACTG CAAGTTCCAC 1200 

TGGTGCTGCT TCGTCAGGTG TAAGAAGTGC ACGGAGATCG TGGACCAGTA CATCTGTAAA 1260 

TAGCCCGGAG GGCCTGCXCC CGGCCCCCCC TGCACTCTGC CTCACAAAGG TCTATATTAT 1320 

ATAAATCTAT ATAAATCTAT TTTATATTTG TATAAGTAAA TGGGTGGGTG CTATACAATG 1380 

GAAAGATGAA AATGGAAAGG AAGAGCTTAT TTAAGAGACG CTGQAGATCT CTGAGGAOTG 1440 

GACTTTGCTO GTTCTCTCCT CTTGGTGGGT GGCAGACAGG GCTTTTTCTC TOCCTCTGGC 1500 

GAGGACTCTC AGGATGTAGG 6ACTTGGAAA TATTTACTGT CTGTOCACCA CGGCCTGGAG 1560 

GAGGGAGGTT GTGGTTGGAT GGAGGAGATG ATCTTGTCTG GAAGTCTAGA GTCTTTGTTG 1620 

GTTAGAGGAC TGCCTGTGAT CCTGGCCACT AGGCCAAGAG GCCCTATGAA GGTGGCGGGA 1680 

ACTCAGCTTC AACCTCGATG TCTTCAGGGT CTTGTCCAGA AK3TAGATGG GTTCCGTAAG 1740 

AGGCCTGGTG CTCTCTTACT CTTTCATCCA CGTGCACTTG TGCGGCATCT GCAGTTTACA 1800 

GGAACGGCTC CTTCCCTAAA ATGAGAAGTC CAAGGTCATC TCTGGCCCAG TGACCACAGA 1860 

GAGATCTGCA CCTCOOGGAC TTCAGGCCTG CCTTTCCAGC GAGAATTCTT CATCCTCCAC 1920 

GGTTCACTAG CTCCTACCTG AAGAGGAAAG GGGGCCATTT GACCTGACAT GTCAGGAAAG 1980 

OCCTAAACTG AATGTTTGCG CCTGGGCTGC AGAAGCCAGG GTGCATGACC AGGCTGCGTG 2040 

GACGTTATAC TGTCTTCCCC CACCCCOGGG GAGGGGAAGC TTGAGCTGCT GCTGTCACTC 2100 

CTCCACCGAG GGAGGCCTCA CAAACCACAG GACGCTGCAA CGGGTCAGGC TG6CGGG0CC 2160 

GGCGTGCTCA TCATCTCTGC CCCAGGTGTA CGGTTTCTCT CTGACATTAA ATGCCCTTCA 2220 
TGGAAAAAAA AAAAAGAAAA AAAAAAAAAA AA 

SEQ ID NO:88 PDV3 Protein sequence
Protein Accession #: NP_116031

1 11 21 31 41 51

I I I I I I 

MPSLLLLFTA ALLSSHAQLL TDANSWWSLA LNPVQRPEMF IIGAQPVCSQ LPGLSPGQRK 60 

LCQLYQEHMA YIGEGAKTGI KECQBQFRQR RWNCSTADHA SVFGRVHQIG SRSTAPTHAV 120 

SAAGWNAIS RACREGELST CGCSRTARPK DLPRDWLWGG CGDNVEYGYR FAKEFVDARE 180 

REKNFAKGSB EQGRVLHNLQ NNBAGRRAVY KMABVACKCH GVSGSCSLKT CWLQLAEFHK 240 

VGDRLKEKYD SAAAHRVTRK GRLELVNSRF TQPTPEDLVY VDPSPDYCLR NESTGSLGTQ 300 
OHLCNKTSBG MDGCELHCCG RGYNQPKSVO VERCHCKFHW CCFVSCRKCT BIVDQYICK- 

SEQ ID NO:89 PDT9 DNA SEQUENCE

Nucleic Acid Accession #: NM_033280

Coding sequence: 58-636 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

GGCAGCCGTC TGTGCCACCC AGAGCCGGCG GGCCGCTAGG TCCCCGGAGA CCCTGCTATG 60 

GTGCGTGCGG GCGCCGTGGG GGCTCATCTC CCCGCGTCCG GCTTGGATAT CTTCGGGGAC 120 

CTGAAGAAGA TGAACAAGCG CCAGCTCTAT TACCAGGTTT TAAACTTCGC CATGATCGTG 180 

TCTTCTGCAC TCATGATATG GAAAGGCTTG ATCGTGCTCA CAGGCAGTGA GAGCCCCATC 240 

GTGGTGGTGC TGAGTGGCAG TATGGAGCCG GCCTTTCACA GAGGAGACCT CCTGTTCCTC 300 

ACAAATTTCC GGGAAGACCC AATCAGAGCT GGTGAAATAG TTGTTTTTAA AGTTGAAGGA 360 

CGAGACATTC CAATAGTTCA CAGAGTAATC AAAGTTCATG AAAAAGATAA TGGAGACATC 420 

AAATTTCTGA CTAAAGGAGA TAATAATOAA GTTGATOATA GAGGCTTGTA CAAAGAAGGC 480 

CAGAACTGGC TGGAAAAGAA GGACGTGGTG GGAAGAGCAA GAGGGTTTTT ACCATATGTT 540 

GGTATGGTCA CCATAATAAT GAATGACTAT CCAAAATTCA AGTATGCTCT TTTGGCTGTA 600 

ATGGGTGCAT ATGTGTTACT AAAACGTGAA TCCTAAAATO AGAAGCACTT CCTGGGACCA 660 

GATTGAAATG AATTCTGTTG AAAAAQAGAA AAACTAATAT ATTTGAGATG TTCCATTTTC 720 
TGTATAAAAQ GGAACAGTGT GQAQATGTTT TTGTCTTCTC CAAATAAAAG ATTCACCAGT 780
AAAAAAAAAA AAAA 

SEQ ID NO:90 PDT9 Protein sequence
Protein Accession #: NP_150596

1 11 21 31 41 51 

I I I I I I 

MVBAGAVGAH LPASGUDIFG DLKKMNKRQL YYQVLNFAHT VSSMJOKKG LIVLTGSESP 60 
IWVLSGSHE PAFHRGDLLF LTNFREDPIR AGEIWFKVE GRDIPIVHHV IKVHEKDNGD 120 
IKPLTKGDNN EVDDRGLYKE GQNWLEKKDV VGRARGFLPY VGKVTIIHND YPKFKYALLA 180 
VMGAYVLLKR ES 
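
The row layout used throughout these listings (blocks of ten residues, six blocks per row, and a trailing running position) allows a simple integrity check on a transcribed record. The following is a minimal sketch under stated assumptions, not part of the original disclosure: the helper name `row_length_mismatches` is hypothetical, the example rows are synthetic, and a row is assumed to end with its running position whenever a numeric label is present.

```python
def row_length_mismatches(rows):
    """Return (row_index, printed_label, counted_total) for inconsistent rows.

    Each row is a string of whitespace-separated residue blocks, optionally
    followed by a running position label (e.g. '... ACGTACGTAC 120').
    """
    problems = []
    total = 0
    for index, row in enumerate(rows):
        parts = row.split()
        if parts and parts[-1].isdigit():
            label = int(parts[-1])
            blocks = parts[:-1]
        else:
            label = None
            blocks = parts
        total += sum(len(block) for block in blocks)
        if label is not None and label != total:
            problems.append((index, label, total))
            total = label  # resynchronise so one bad row is reported once
    return problems


if __name__ == "__main__":
    # Synthetic rows: the second row has a nine-residue block.
    rows = [
        "ACGTACGTAC " * 6 + "60",
        "ACGTACGTAC ACGTACGTA ACGTACGTAC ACGTACGTAC ACGTACGTAC ACGTACGTAC 120",
        "ACGT",
    ]
    print(row_length_mismatches(rows))
```

Rows flagged by such a check are candidates for dropped or duplicated residues and would need to be compared against the deposited accession.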



SEQ ID NO:91 PDV5 DNA SEQUENCE

Nucleic Acid Accession #: NM_016390

Coding sequence: 631-975 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

GATTACTCAC ACAGTCTTGA AGATGCAATG TCAGCTATTT AGGACAGAAA CATCCAAGGC 60

CGTGTCAGAA CTCAATTACO ACTACATATQ CATTAAGGCA GGAACT6GCA GGCCTCAGGG 120 

TACGCCAACT ATAGGACTC6 TGCTTCTCGT ACGCTGGGCT ATAATCTATG AAACTGAGCT 180 

CCAGAGCCAO CCAATCACTT AGCTCCTCAT AACAAGTCTA ACTGGCTCTG GAAAGCTGAA 240 

AG66CTGCAC TG6AACAACA CAGATGAGAT ATTCTACACA TTAATCTACT TATCTGGAAT 300

CACTTTGCCT CTAAAGGCCA GAGAAAAATC ACAGCTTCCT TGTCGGAGGG GAAAAGGACA 360

GGTGATCTGO GGAAAACGCA GCTACACCTG GAGCAAGGTC TCTTCCOGGC TTGGCAATCT 420 

CAGCTCTGCC GGCGCTACGG GACCCGAGCC GTCCCAGAAA CCAAAGGGCA GGCACGGCAG 480 

CAAACGCCTQ AGTGCTGCTG GCTTCGGT6A CTATATGAGA ATOOAAACTT CTAAGGAAGC 540 

CAGGTTGTTA CAATTGTTAC CCCCTTTACT CAGAGATAAC ATA6ATTATC CAGGCTGAGA 600

TGGAAAACAA GCCCTTTATT GAATTTTCAA CACAGACTCC CTGCTTCTCA TCTCCTTAAT 660

AAAATTTCAT TAAAATCOCC TTGAACTOCC ATOTTCAAAT CTCCATTTGT TGACAGACAA 720

AGCCAACAAT ACTCTAAACT GAGGCCTGCA AGTCATTTCA TTTGTATTTT TGTOCAGAAA 780

TTTCCCATAG GAAGACTTCA CCTOCTACAA CTCCGAAGAA AAOCCTTACT GTCCAAGACC 840

GTCACCAGCA ACCATCCGCA GTCATTCAAG TGGAAGCTTT CACAGCTTTT GTACATTCTC 900

TCTGTCAATA TACAACEGAQ TTACAGACTG TCCCCTGGCT CCCTGACCCT TACAAACACT 960

AAAAGTTTTG TTTGACTCAA CTTCAAGCTG CTCATCTGTT AQTAAGTGAT GTTCACTCCA 1020 

GAACACATTC ATGATGAGAA CTTTCTAAAA GACCAGCACT GCTCTTCCCC TCCTATAATC 1080 

ATAATAATCA TGATAACCTG AAACATGTTA CTGGGACTCG ACATTTTTCT GGGGATTGAA 1140 

ATCTTTAGTC CTTGGAGCTG TCACATAGCA GGGGCAACCT CACACTGAAA CAAAGGAAGT 1200 

GATGTCCCAT TATTATCCAC CCTGAGCCAC CATAATATGC TGTTTACATT TATTTTCTTC 1260 

AGCCTGTGCA AAACAAAGCA ATGGAAAAGG AAACTAAAAA ATATACATAC TAGTACCATT 1320 

ATCTTCTTTT GCCTAAAATT ACTAATGCAC CACGTCAGTC TGCTTCCTTC AGGCATCATT 1380 

CTCAATTCAT CAGGACTTGT ATTAGCAGGT TCTGGCTAGA GAGACTATCT CCTGTCATCA 1440 

OGATCAATTA ATGTTTTCTG GTGATCACAT CAGGCCCTAT CTAAGAAGCT CATGGTAXAC 1500

AAGGGTCACC CAAATAGCTG AGTGCAGTCC TTGCTCATAT TTOCTTCATC TTAACCCCGC 1560

AAACAAGAAT TAAGATGATC CCAATAAAAG AAAAATTGCT CAGGAAACTG AACCTTTTTC 1620 

TGAACCAAGC ACTGTCAGCA AATCTCAGGT ATTAGAGCAA CTATGGTTGA TTGAAAAGTG 1680 

TCTCAAAATC TGGGOCAAGA ATGATTGCTA GGTCCATAAG CTAATTTGTC TGGCCTTGCC 1740 

ATTTACGTAA GCCAAAGAAA CTCACTCATG AGTAAACTAT AGAAAACGTT CAGACCCATC 1800 

CTGTTAGTAT GTCAAATCAA CTAAGACTGG CAGGGTATTA ACTCCATTCC AGGTGACATG 1860 

GASAAAGAGC CCCATTATTT TCACAGTGCC AGCCTCTACC TAAGGAAACC CTAGACCTTG 1920

GAACCAGTTT CCTGGTAGGG AACTGCTGAC AGTTTCAATG CTGACAGTTG GAGOCAATGC 1980 

CTCATAGTGT AAACTGAAAG AAAAATAGTT GCTTTTTAAA ATGTCAGCAA GAAGGCCTGC 2040 

CTCATCTTAA CAAAGCAAAA AAAAATGCTT TAATTCAAAT TAAAAATCAT GATACTAAAA 2100 
AAAAAAAA 

SEQ ID NO:92 PDV5 Protein sequence
Protein Accession #: NP_0J7674

1 11 21 31 41 51

I I I I I I 

MQCQLFRTET SKAVSELNYD YICIKAGTGR PQGTPTIGLV LLVRWAHYE TELQSQPIT 

SEQ ID NO:93 PEE6 DNA SEQUENCE

Nucleic Acid Accession #: NM_002606

Coding sequence: 61-1842 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CGCGGCGGCT GGCGTCGGGA AAGTACAGTA AAAAGTCCGA GTGCAGCCGC CGGGCGCAGG 60

ATGGGATCCG GCTCCTCCAG CTAOCGGCCC AAGGCCATCT ACCTGGACAT CGATGGACGC 120 

ATTCAGAAGG TAATCTTCAG CAAGTACTGC AACTCCAGCG ACATCATGGA CCTGTTCTGC 180 

ATCGCCACCG GCCTGOCTCG GAACACGACC ATCTCCCTGC TGACCACCGA CGACGCCATG 240 

GTCTCCATCG ACCCCACCAT GCCCGCGAAT TCAGAACGCA CTCCGTACAA AGTGAGACCT 300

GTGGCCATCA AGCAACTCTC CGCTGGTGTC GAGGACAAGA GAAOCACAAG CCGTGGCCAG 360

TCTGCTGAGA GACCACTGAG GGACAGACGG GTTGTGGGCC TGGAGCAGCC CCGGAGGGAA 420 

GGAGCATTTG AAAGTGGACA GGTAGAGCCC AGGCCCAGAG AGOOCCAGGG CTGCTACCAG 480 

GAAGGCCAGC GCATCCCTCC AGAGAGAGAA GAATTAATCC AGAGCGTGCT GGCGCAGGTT 540 

GCAGAGCAGT TCTCAAGAGC ATTCAAAATC AATGAACTGA AAGCTGAAGT TGCAAATCAC 600 

TTGGCTGTCC TAGAGAAAOG CGTQGAATTG GAAGGACTAA AAGTGGTGGA GATTGAGAAA 660



TGCAAGAGTG 
TGCCCCTGTA 
CCCACTTACC 
STTGACGTCT 
GACCTCGGGC 
TGTGTCCACG 
GCCCAGATGA 
GATATCCTGA 
AACACGTACC 
CTGGAGAACC 
TTCTCCAACA 
TTGGCCACTG 
AATTTTGACT 
TGTCATATCT 
TTAGAGGAAT 
TTCATGGACC 
CTGATCCCAA 
CAGCCACTTT 
AAAGAGTTAC 
AGAAGCAGAG 
CTGCAGTTCT 
TGGGCACCTG 
AAAAAAAAAA 



ACATTAAGAA 
AGTACAGTTT 
CCAAGTACCT 
GGCTTTGGGA 
TCGTCAGGGA 
ACAACTACAG 
TGTACASCAT 
TCCTAATGAC 
AGATCAATGC 
ACCACT6C6C 
TCCCACCTGA 
ACATGGCAAG 
ACAGCAACCA 
CTAACGAGGT 
ATTTTATGCA 
GACACAAAGT 
TOTTTGAAAC 
GGGAATCCCG 
AGAAGAAGAC 
ATGTGAAAAA 
GGACGGGCTG 
GCACCACAAG 
A 



GATGAGGGAG 
TTTGGATAAC 
GCTCTCTCCA 
GCCCAATGAG 
CTTCAGCATC 
AAACAACCCC 
GGTCTGGCTC 
AGCGGCCATC 
CCGCACAGAG 

TGGGTTCAAG 
ACATGCAGAA 
GGAGCACATG 
CCGTOCAATG 
GAGCGAOCGT 
GACCAAGGCC 
AGTGAOCAAG 
AGATOGCTAC 
TGACAGCTTG 
CAGTGAAGGA 
GCCGAGCTGC 
ACCATGTTTT 



GAGCTGGGGG 
CACAAGAAGT 
GAGACCATCG 
ATGCIGAGCT 
AACCCTGTCA 
TTCCACAACT 
TGCAGTCTCC 
TGCCACGATC 
CTGGCGGTCC 
CAGATCCTCG 
CAGATCCGAC 
ATTATGGATT 
ACOCTGCTGA 
GAAGTCGCAG 
GAGAAGTCAG 
ACAGCCCASA 
CTCTTCCCCA 
GAGBAGCTGA 
AOGTCTGGGG 
GACTGTGCCT. 
GCGGGATCCT 
CTAAGAACCA 



SEQ ID NO:94 PEE6 Protein sequence
Protein Accession #: NP_002597



KGSGSSSYRP 
VSIDPTMPAN 
GAFESGQVEP 
LAVLEKRVEL 
PTYPKYLLSP 
CVHMWRHNP 
NTYQINARTE 
LATDKARHAE 
IEEYFKQSDR 
QPLWESRDRY 



11 
I 

XAXYLDXDGR 
SERTPYKVRP 
RPREPQGCYQ 
EGLKWEIEK 
ETIEALRKPT 
FHNFEKCFCV 
LAVRYKDISP 



21 • 
I 

IQKVIPSKYC 
VAXKQLSAGV 
EGQRIPPERE 
CKSDIKKHRE 
FDVWLWEPNE 
AOMKrSMVWL 
LENHHCAVAF 



EKSEGLPVAP 
EELKRIDDAM 



FMDRDKVTKA 
XELQKKTDSL 



31 
I 

NSSDMDLFC 
EDKftTTSRGQ 
ELIQSVLAQV 
ELAARSSRTN 
HLSCLEHMYH 
CSLQEKFSQT 
QILABPECNI 
TLLKMILIKC 
TAQIGFIKPV 
TSGATEKSRE 



CCAGAAGCAG 
TGACTCCTCG 
AGGCCCTGCG 
GCCTGGAGCA 
CCCTCAGGAG 
TCCGGCACTG 
AGGAGAAGTT 
TGGACCATCC 
GCTACAATGA 
OCGAGCCTGA 
AGGGAATGAT 
CTTTCAAAGA 
AGATGATTTT 
AGCCTTGGGT 
AAGGOCTTCC 
TTGGGJTTCAT 
TGGTTGAGGA 
AGCGGATAGA 
CCACCGAGAA 
GAGGAAAGCG 
TGTGCAGGGA 
TTTTGTTCAC 



41 
I 

IATGLPBNTT 
SAERPL8DRR 
AEQFSHAPKI 
CPCKYSFLDK 
DLGLVRDFSI 
DILILMTAAI 
FSNIPPDGPK 
CDISNEVRPH 
LIPMFETVTK 
RSRDVKNSEG 



CAGGAOCAAC 
ACGCGATGTT 
GAAGCOGACC 
CATGTAGCAC 
GTGGCTGTTC 
CTTCTCCGTG 
CTCACAAACQ 
CGGCTACAAC 
CATCTCACCG 
GTGCAACATC 
CACATTAATC 
GAAAATGGAG 
GATAAAATGC 
GGACTGTTTA 
TGTGGCACCG 
CAAGTTTGTC 
GATCATGCTG 
TGACGCCATG 
GTCCAGAGAG 



AGAGCTGCCC 
TGAXACAAAA 



51 
I 

ISLLTTDDAM 
WGLEQPHRE 
NELKAEVAHH 
HKKLTPRRDV 
NPVTLRRWLF 
CHDLDHPGYN 
QIRQGMITLI 
EVAEPWVDCL 
LFPMVEEIML 
DCA 



720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 




CTATTTTGGG 
GCTGGTGGGC 
GAATTATCCA 
CTCACCCGTG 
TGGTGGCTGA 
TAAAGCTTCT 



11 
I 

CGAGAGCCYT 
GGGGGCCCCA 
ACTGAAGACT 
AAAAGTGICC 
AATACCCAGG 
CAAGCTTTCC 
AAGCTTGATG 
ATCTATGGCC 
CTAGAGGAGC 
GGTCGCTAGG 
TGGTACTGGA 
GCAGAATCAG 



21 
I 

GGGATGCACC 
CCTGGGCAGG 
ACGACCATGA 
AGGTGAAACT 
AAGTCACCCT 
TCCGGGGTAT 
GCCAGAICTC 
AGTATCAACT 
CGAOCACTGA 
GTGGGGTATG 
GTAACTGAGT 
TGAAAAAAAA 



31 
I 

GGCCAGAGGC 
GAAGATGTAT 
AATCACAGGG 
TGGAGACTCC 
GCAGCCAGGC 
GGTCATGTAC 
CTCTGOCTAC 
CCTTGGCATC 
GCCACCAGTT 
GGSCCATCCG 
CGGGAOGCTG 
A 



41 
I 

ATGCTGCTGC 
GGCCCTGGAG 
CTGCGGGTGT 
TGGGACGTGA 
GAATACATCA 
AOCAGCAAGG 
CCCAGCCAAG 
AAGAGCATTG 
AATCTCACAT 
AGCTGAGGCC 
AATCTGAATC 



51 

I 

TGCTCACGCT 
GAGGCAAGTA 
CTGTAGGTCT 
AACTGGGAGC 
CAAAAGTCTT 
ACCGCTATTT 
AGGGGCAGCT 
GCTTTGAATG 
ACTCAGCAAA 
ATCTGTGTGG 
CACCAATAAA 



SEQ ID NO:96 PEG4 Protein sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51

I I I I I I

HLLLLTLALL GGPTWAGKMY GPGGGKYFST TEDYDHEITG LRVSVGLLLV KSVQVKLGDS 
WDVKM3ALGG NTQEVTLQPG EYITKVFVAF QAPLRGHVHY TSKDRYFYFG KLDGQISSAY 
PSQEGQVLVG IYGQYQLLGI KSIGFEHNYP LEEPTTEPPV NLTYSANSPV GR 



60 
120 
180 
240 
300 
360 
420 
480 
540 



SEQ ID NO:95 PEG4 DNA SEQUENCE

Nucleic Acid Accession #: none

Coding sequence: 41-559 (underlined sequences correspond to start and stop codons)



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 



60 
120 






SEQ ID NO:97 PEL9 DNA SEQUENCE

Nucleic Acid Accession #: NM_006953

Coding sequence: 33-896 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CCGTTCCGCG CTCTGCCGGC TCCTOCCGGG CGATGCCTCC GCTCTGGGCC CTGCTGGCCC 60 

TCGGCTGCCT GCGGTTCGGC TCGGCTGTGA ACCTGCAGCC CCAACTGGCC AGTGTGACTT 120 

TCGOCACCAA CAACCCCACA CTTACCACTG TGGCCTTGGA AAAGCCTCTC TGCATGTTTG 180 

ACAGCAAAGA GGCCCTCACT CGCACCCACG AGGTCTACCT GTATGTCCTG GTCGACTCAG 240 

CCATTTCCAG GAATGCCTCA GTGCAAGACA GCACCAACAC CCCACTGGGC TCAACGTTCC 300 
TACAAACAGA GGGTGGGAGG ACAGGTCCCT ACAAAGCTGT GGCCTTTGAC CTGATCOCCT 360

GCAGTGACCT GCCCAGCCTG GATGCCATTG GG6ATGT0TC CAAGGCCTCA CAGATCCTGA 420 

ATGCCTACCT GGTCAGGGTG GGTGCCAAC6 GGACCTGCCT GTGGGATCCC AACTTCCAGO 480 

GCCTCTGTAA CGCACCCCTG TCGGCAGCCA CGGAGTACAG GTTCAAGTAT GTCCTGGTCA 540 

ATATGTCCAC GGGCTTGGTA GAGGACCAGA CCCTQTGGTC GGACCCCATC CGCAOCAACC 600 

AGCTCACOCC ATACTCGACG ATCGACACGT GGCCAGGCOG GCGGAGCGGA GGCATGATCG 660 

TCATCACTTC CATCCTSSGC TCCCTGCCCT TCTTTCTACT TGTGGQTTTT GCTGGOGCCA 720 

TTGOCCTCAG CCTCGTGGAC ATGGGGAGTT CTGATGGGGA AACGACTCAC GACTCCCAAA 780 

TCACTCAGGA GGCTGTTCCC AAGTCGCTGG GGGCCTCGGA GTCTTCCTAC ACGTCCGTGA 840 

ACCGGGGGCC GCCACTGSAC AGGGCTOAGO TGTATTCCAG CAAGCTCCAA GACTGAGCCC 900 

AGCAOCACCC CTGGGCAGCA GCATCCTCCT CTCTGGCCTT GCCOCAGGCC CTGCAGCGGT 960 

GQTTGTCACA CCCTGACTTC AGQGAAGGTG AAACAGGGCT TGTCCCTCCA ACTGCAGGAA 1020 
AACCCTTAAT AAAATCTTCT GATGAGTTCT AAAAAAAAA 

SEQ ID NO:98 PEL9 Protein sequence
Protein Accession #: NP_008884

1 11 21 31 41 51

I I I I I I

KPPLWALUU* GCLRFGSAVN LQPQLASVTF ATNNPTLTTV ALEKPLCMFD SKBALTGTHG 60 

VYLYVLVDSA ISRNASVQDS TNTPLGSTFL QTEGGRTGPY KAVAFDL1PC SDLPSLDAIG 120 

DVSKASQILN AYLVRVGABG TCLWDPNFQG LCNAPLSAAT EYRFKYVLVN HSTGLVEDQT 180 

LWSDPIRTNQ LTPYSTIDTW PGRRSGGMIV ITSILGSLPF FLLVGFAGAI ALSLVDMGSS 240 
DGETTKDSQI TQEAVPKSLG ASESSYTSVN RGPPLDRAEV YSSKLQD 

SEQ ID NO:99 PEN1 DNA SEQUENCE

Nucleic Acid Accession #: NM_012391

Coding sequence: 416-1423 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

GTCTGACTTC CTCCCAGCAC ATTCCTGCAC TCTGCCGTGT CCACACTGCC CCACAGACCC 60 

AGTCCTCCAA GCCTGCTGCC AGCTCCCTGC AAGCCCCTCA GGTTGGGCCT TGCCACGGTG 120 

CCAGGAGGCA GCCCTGGGCT GGGGGTAGGG GACTCCCTAC AGGCACGCAG CCCTGAGACC 180 

TCAGAGGGCC AOCCCTTGAG GGTGGCCAGG CCCCCAGTGG CCAACCTGAG TGCTGCCTCT 240 

GCCACCAGCC CTGCTGGCCC CTGGTTCCGC TGGCCCGCCA GATGCCTGGC TGAGACACGC 300 

CAGTGGCCTC AGCTCCCCAC ACCTCTTCCC GGCCCCTGAA GTTGGCACTG CAGCAGACAG 360 

CTCCCTGGGC ACCAGGCAGC TAACAGACAC AGCCGCCAGC CCAAACAGCA GCGGCATCGG 420 

CAGCGCCAGC CCGGGTCTGA GCAGCGTATC CCCCAGCCAC CTCCTGCTGC CCCCCGACAC 480 

GGTGTCGCGG ACAGGCTTGG AGAAGGCGGC AGCGGGGGCA GTGGGTCTCG AGAGACGGGA 540 

CTGGAGTCCC AGTCCACCCG CCACGCCCGA GCAGGGCCTG TCCGCCTTCT ACCTCTCCTA 600 

CTTTGACATG CTGTACCCTG AGGACAGCAG CT6GGCAGCC AAGGCCCCTG GGGCCAGCAG 660 

TCGGGAGGAG CCACCTGAGG AGCCTGAGCA GTGCCCGGTC ATTGACAGCC AAGCCCCAGC 720 

GGGCAGCCTG GACTTGGTGC CCGGCGGGCT GACCTTGGAG GAGCACTCGC TGGAGCAGGT 780 

GCAGTCCATG GTGGTGGGCG AAGTGCTCAA GGACATCGAG ACGGCCTGCA AGCTGCTCAA 840 

CATCACCGCA GATCOCATGG ACTGGAGCCC CASCAATGTG CAGAAGTGGC TCCTGTGGAC 900 

AGAGCACCAA TACCGGCTGC CCCCCATSGG CAAGGCCTTC CAGGAGCTGG CGGGCAAGGA 960 

GCTGTGCGCC ATGTCGGAGG AGCAGTTCOQ CCAGCGCTCG CCOCTGGGTG GGGATGTGCT 1020 

GCACGCCCAC CTGGACATCT GGAAGTCAGC GGCCTGGATG AAAGAGCGGA CTTCACCTGG 1080 

GGCGATTCAC TACTGTGCCT CGACCAGTGA GGAGAGCTGG ACCGACAGCG AGGTGGACTC 1140 

ATCATGCTCC GGGCAGCCCA TCCACCTGTG GCAGTTCCTC AAGGAGTTGC TACTCAAGCC 1200 

CCACAGCTAT GGOCGCTTCA TTAGGTGGCT CAACAAGGAG AAGGGCATCT TCAAAATTGA 1260 

GGACTCAGCC CAGGTGGCCC GGCTGTGGGG CATCCGCAAG AACCGTCCCG CCATGAACTA 1320 

CGACAAGCTG AGCCGCTCCA TCCGCCAGTA TTACAAGAAG GGCATCATCC GGAAGCCAGA 1380 

CAICTCCCAQ CGCCTCGTCT ACCAGTTCGT GCACCCCATC TGAGTGCCTG GCCCAGGGCC 1440 

TGAAACCCGC OCTCAGGGGC CTCTCTCCTG CCTGCCCTGC CTCAGCCAGG CCCTGAGATG 1500 

GGGGAAAACG GGCAGTCTGC TCTGCTGCTC TGACCTTCCA GAGCCCAAGG TCAGGGAGGG 1560 

GCAACCAACT GCCCCAGGGG GATATGGGTC CTCTGGGGCC TTCGGGACCA TGGGGCAGGG 1620 

GTGCTTCCTC CTCAGGCCCA GCTGCTCCCC TGGAGGACAG AGGGAGACAG GGCTGCTGCC 1680 

CAACACCTGC CTCTGACCCC AGCATTTCCA GAGCAGAGCC TACAGAA6GG CAGTGACTCG 1740 

ACAAAGGCCA CAGGCAGTCC AGGCCTCTCT CTGCTCCATC CCCCTGCCTC CCATTCTGCA 1800 

CCACACCTGG CATGGTGCAG GGAGACATCT GCACCCCTGA GTTGGGCAGC CAGGAGTGCC 1860 
CCCGGGAATG GATAATAAAG ATACTAGAGA ACTG 

SEQ ID NO:100 PEN1 Protein sequence
Protein Accession #: NP_036523

1 11 21 31 41 51 

I I I I I I 

MGSASPGLSS VSPSHLLLPP DTVSRTGLEK AAAGAVGDEH RDWSPSPPAT PEQGLSAFYI* 60 

SYFDKLYPED SSWAAJCAPGA SSREEPPEEP EQCPVTDSQA PAGSLDLVPG GLTLEEHSLE 120 

QVQSMWGEV LKDIETACKL LNITADPHDW SP5NVQKHLL WTEHQYRIiPP MGKAFQELAG 180 

KBLCAMSEEQ FRQR5FLGGD VLHAHLDIWK SAAWMKERTS PGAIHYCAST SEBSWTDSEV 240 

DSSCSGQPIH LWQPLKELLL KPHSYGRPIR WLNKEKGIFK IEDSAQVARL WGIRKNRFAK 300 
NYDKLSRSIR QYYKXGIIRK FDISQRLVYQ PVHPI 

SEQ ID NO:101 PEN3 DNA sequence

Nucleic Acid Accession #: NM_000742

Coding sequence: 555-2144 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51

I I I I I I

GAGAGAACAG CGTGAGCCTG TGTSCTTGTG TGCTGAGCCC TCATCCCCTC CTGGGGCCAG 60

GCTTGGGTTT CACCTGCAGR ATCGCTTGTG CTGGGCTGCC TGGGCTGTCC TCAGTGGCAC 120 

CTGCATGAAG CCGTTCTGGC TGCCAGAOCT GGACAGCCCC AGGAAAACCC ACCTCTCTGC 180 

AGAGCTTGCC CAOCTGTCCC CGGGAAGCCA AATGCCTCTC ATGTAAGTCT TCTGCTCGAC 240 

GGGGTGTCTC CTAAACCCTC ACTCTTCAGC CTCTOTTTGA CCATGAAATO AAGTGACTGA 300 

GCTCTATTCT GTACCTGCCA CTCTATTTCT CGCQT6ACTT TTOTCAGCTG CCCAGAATCT 360 

CCAAGCCAGG CTGGTTCTCT GCATCCTTTC AATGACCTGT WICTTCTQT AACCACAGGT 420 

TCGGTGGTGA GAGGAAGCCT CGCAGAATOC AGCAGAATCC TCACAGAATC CAGCAGCAGC 480 

TCTGCTGGGQ ACATGGTCCA TGGTGCAAOC CACA0CAAA6 CCCTGACCTG ACCTCCTGAT 540 

GCTCAGGAGA AGCCATGGGC CCCTCCTGTC CTGTGTTCCT GTCCTTCACA AAOC7CAGCC 600 

TGTGGTGGCT CCTTCTGACC CCAGCAGGTG GAGAGGAAOC TAAGCGOCCA CCTCCCAGGG 660 

CTCCTGGAGA COCACTCTCC TCTCCCAGTC OCACGQCATT GCCGCAGGGA GGCTCGCATA 720 

CCGAGACTGA GGACCGGCTC TTCAAACACC TCTTCCGGGG CEACAACCGC TGGGCGCGCC 780 

CGGTSCCCAA CACTTCAGAC GTGGTGATTG TGOGCTTTGG ACTGTOCATC GCTCAGCTCA 840 

TCGATGTGGA TCAGAAGAAC CAAATGAT6A CCACCAACGT CTGGCTARAA CAGQAQTGGA 900 

GCQACTACAA ACTGCGCTGG AACCCCGCTG ATTTTGGCAA CATCACATCT CTCA6G6T0C 960 

CTTCTGAGAT 6ATCTGGATC CCCGACATTO TTCTCTACAA CAATGCA6AT GG6GA0TTT6 1020 

CAQT6ACCCA CATGACCAAQ GCCCACCTCT TCTCCACGGG CACTGT6CAC TGGOTGCCOC 1080 

CGGCCATCTA CAAGAGCTCC TGCAGCATCG ACCTCACCTT CTTCCCCTTC GACCAGCAGA 1140 

ACTGCAAGAT GAAGTTTGGC 7CCTGGACTT ATGACAAGGC CAAGATCGAC CTGGAGCAGA 1200 

TGGAGCAGAC TOTGOACCTO AAGGACTACT GGQAGAGCGG CQAGT6GG0C ATC6TCAATQ 1260 

CCACGGGCAC CZACAACAQC AAGAASTOCG ACTGCTGOGC CGAGATCCAC CCCGACGTCA 1320

CCTACGCCTT CGTCATCCGG CGGCTGOCGC TCTTCTACAC CATCAACCTC ATCATCCCCT 1380 

GCCTGCTCAT CTCCTGCCTC ACTGTGCTGG TCTTCTACCT GCCCTCCGAC TGCGGCGAGA 1440 

AGATCACGCT GTGCATTTCG GTGCTOCTGT CACTCACCGT CTTCCTGCTG CTCATCACTG 1500 

AGATCATCCC GTCCACCTCG CTGGTCATCC CGCTCATCGG CGAGTACCTG CTGTTCACCA 1560 

TGATCTTCGT CACCCTGTCC ATCGTCATCA CCGTCTT06T GCTCAAT6TG CACCACCGCT 1620 

CCCCCAGCAC CCACACCATG CCCCACTGGG TGCGGGGGGC CCTTCTGGGC TGTQTGCOCC 1680 

GGTGGCTTCT GATGAACCGG CCCCCACCAC OCGTGGAGCT CTGCCACCCC CTACGCCTGA 1740 

AGCTCAGCCC CTCTTATCAC TGGCTGGAGA GCAACGTGGA TGCCGAGGAG AGGGAGGTGQ 1800 

TGGTGGAGGA GGAGGACAGA TGGGCATGTG CAGGTCATGT GGCCCCCTCT GTGGGCACCC 1860 

TCTGCAGCCA CGGCCACCTG CACTCTGGGG CCTCAGGTCC CAAGGCTGAG GCTCTGCTGC 1920 

AGGAGGGTGA OCTGCTGCTA TCACCCCACA TGCAGAAGGC ACTGQAAGGT QTGCACTACA 1980 

TTGCCGACCA CCTGCGGTCT GAGGATGCTG ACTCTTCGGT GAAGGAGGAC TGGAAGTATS 2040 

TTGCCATGGT CATCGACAGG ATCTTCCTCT G G CT G TT T AT CATCGTCTGC TTCCTGGGGA 2100 

CCATCGGCCT CTTTCTGCCT CCGTTCCTAG CTGGAATGAT CTGACTGCAC CTCCCTCGAG 2160 

CTGGCTCCCA GGGCAAAGGG GAGGGTTCTT GGATGTGGAA GGGCTTTGAA CAATGTTTAG 2220 

ATTTGGAGAT GAGCCCAAAG TGCCAGGGAG AACAGCCAGG TGAGGTGGGA GGTTGGAGAG 2280 

CCAGGTGAGG TCTCTCTAAG TCAGGCTGGG GTTGAAGTTT GGAGTCTGTC CGAGTTTGCA 2340 

GGGTGCTGAG CTGTATGGTC CAGCAGGGGA GTAATAAGGG CTCTTCCGGA AGGGGAGGAA 2400 

GCGGGAGGCA GGCCTGCACC TGATGTGGAG GTACAGGCAG ATCTTCCCTA CCGGGGAGGG 2460 

ATGGATGGTT GGATACAGGT GGCTGGGCTA TTCCATCCAT CTGGAAGCAC ATTTGAGCCT 2520 

CCAGGCTTCT CCTTGACGTC ATTCCTCTCC TTCCTTGCTG CAAAATGGCT CTGCACCAGC 2580 

CGGCCCCCAG GAGGTCTGGC AGAGCTGAGA GCCATGGCCT GCAGGGGCTC CATATGTCCC 2640 
TACGCGTGCA GCAGGCAAAC AASA 

SEQ ID NO:102 PEN3 Protein sequence
Protein Accession #: NPJM0733

1 11 21 31 41 51 

I I I I I I 

MGPSCPVPLS FTKLSLWWLL LTPAGGEEAK RPFPRAFGDF LSSPSPTALP QGGSHTETED 60 
RLFKHLFRGY NRWARPVPNT SDWIVRFGL SIAQLIDVDE KNQMHTTNVW LKQEWSDYKL 120 
RWNPADFGNI TSLKVPSEMI WIPDIVLYNN ADGBFAVTHM TKAHLFSTGT VHWVPPAIYK 180 
SSCS1DVTFP PPDQQNCKMK FGSWTVDKAK IDLEQHEQTV DLKDVWESGB WAIVNATGTY 240 
tfSKKYDCCAE IYFDVTZAFV XBRLPLFYTI NLIIPCLLIS CLTVLVFYLP SDCGEKITLC 300 
ISVLLSMVF LLLITEIIPS TSLVIPLIGB YLLFTMIFVT LSIVITVPVL MVHHRSPSTH 360 
TMPHWVEGAL LGCVPRWLLM NRPPPPVEIjC KPLRLKLSPS YHWLESNVHA EEREVWEEE 420 
DRWACAGHVA PSVGTLCSHG HLHSGASGPK AEALLQEGEL LLSPHMQKAL EGVHYIADHL 480 
RSEDADSSVK EDWKYVAMVI DRJFLWLFII VCFLGTIGLF LPPFLAGMI 

SEQ ID NO:103 PEU4 DNA SEQUENCE

Nucleic Acid Accession #: NM_018670

Coding sequence: 87-893 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

CACGAGGCTG GAAGGGGCCA CTTCACACCT CGGGCTCGGC ATAAAGCGGC CGCCGGCCGC 60 

CGGCCCCCAG ACGCGCCGCC GCTGCCATGG CCCAGCCCCT GtGCCCGCCG CTCTCCGAGT 120 

CCTGGATGCT CTCTGCGGCC TGGGGCCCAA CTCGGCGGCC GCCGCCCTCC GACAAGGACT 180 

GCGGCCGCTC CCTCGTCTCG TCCCCAGACT CATGGGGCAG CACCCCAGCC GACAGCCCCG 240 

TGGCGAGCCC CGCGCGGCCA GGCACCCTCC GGGACCCCCG CGCCCCCTCC GTAGGTAGGC 300 

GCGGCGCGCG CAGCAGCCGC CTGGSCAGCG GGCAGAGGCA GAGCGCCAGT GAGCGGGAGA 360 

AACTGCGCAT GCGCACGCTG GCCCGCGCCC TGCACGAGCT GCGCCGCTTT CTACCGCCGT 420 

CCGTGGCGCC CGCGGGCCAG AGCCTGACCA AGATCGAGAC GCTGCGCCTG GCTATCCGCT 480 
ATATCGGCCA CCTGTCGGCC GTGCTAGGCC TCAGCGAGGA GAGTCTCCAG CGCCGGTCCC 540

GGCAGCGCGG TGACGCGGGG TCCCCTCGGG CCTGCCCGCT GTGCCCCGAC GACTGCCCCG 600 

CGCAGATGCA GACACGGACG CAGGCTGAGG GGCAGGGGCA GGGGCGCGGG CTGGGCCTGG 660
TATCCGCCGT CCGCGCCGGO G O GICCTG G G GATCCCCGCC TGCCTGCCCC GGAGCCCGAG 720

CTGCACCCGA GCCGCGCGAC CCGCCTGCGC TGTTCGOCGA GOCGGCGTGC CCGGAAGGGC 780 

AGGCGATGGA GCCAAGCOCA CCGTCCCCGC TCCTTCCCGG CGACGTGCTG GCTCTGTTGG 840 

AGACCT6GAT GOCCCTCTCO CCTCTGGAGT GGCTGCCTGA GGAGCCCAAG TGACAAGGGA 900 

CAACTGACGC CG T CTCTGTQ AGCACCQAGO CTTTTTGGCC TCAGCACCTT CGAAGTSOTT 960 

CCTTGGCAGA CTGCCTTTCC TGGAASAGGG CACGGGCGAT COOGACGGGG GCATTCCTGC 1020 

GGGTGAGAGC CGTCCCCACC GCGGCGGCCC TTCTCAOCCC CTCCCTCCAT GGAGGGACCC 1080 

ATAGGGCTAG ACACTTTGAG GCAAGCAGGA GOCTCTGCCT AATGTGAATT TATTTATTTG 1140 
TGAATAAACT GTACTGGTCT CAAAAAAAAA AAAAAAAAAA A 

SEQ ID NO:104 PEU4 Protein sequence
Protein Accession #: NP_061140

1 11 21 31 41 51 

I I I I I I 

HAQPIiCPPLS ESWMLSAAWG PTKRPPPSBK DCGRSLVSSP DSMGSTPADS PVASPARPGT 60 
LRDPRAPSVG KRGARSSPXG S6QRQSASER EtORHRTLAR ALHBLRKFLP PSVAPAGQSL 120 
TKIETLRLAI RYIGHLSAVL GLSEESLQRR CRQRGEACSP HGCPLCPDDC PAOJHQTRTQA 180 
EGOGOGRGLG LVSAVKAGAS WGSPPACPGA RAAPEPRDPP ALFAEAACPE GQABEPSPPS 240 
PLLPGDVLAL LBTWHPbSPL EWLPEBPK 

SEQ ID NO:105 PEU5 DNA SEQUENCE

Nucleic Acid Accession #: NM_017636

Coding sequence: 324-3374 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I

CCACGGAGAA GCCCACCGAT CCCTACGOAG AGCTGOACTT CACGGGGGCC GGCCGCAAGC 60 

ACAGCAATTT CCTCCGGCTC TCTGACCGAA CGGAICCAGC TGCAGTTTAT AGTCTGGTCA 120

CACGCACATG GGGCTTCCQT GCCCCGAACC TGGTGGTGTC AGTGCTGG6G GQATCGGGGG 180

OCCCCGTCCT CCAGACCTGG CTGCAGGACC TGCTGCGTCO TGGGCTGGTG CGGGCTGCCC 240 

AGAGCACAGG AGCCTGGATT GTCACTGGGG GTCTGCACAC GGGCATCGGC CGGCATGTTG 300 

GTGTGGCTQT ACGGOACCAT CA CATCC CCA CCACTGSGGG CACCAAGCTC GTGGCCATGG 360 

GTGTGGCCCC CTGGGGTGTG GTCCGGAATA GAGACACOCT CAICAACCCC AAGGGCTCGT 420

TCCCTGCGAG GTACOGGTGG CGOGGTGACC CGGAGGACGG GGTCCAGTTT CCCCTGGACT 480

ACAACTACTC GGCCTTCTTC CTGGTGGACG ACGGCACACA CGGCTGCCTG GGGGGCGAGA 540 

ACCGCTTGCG CTTGOGCCTG GAGTCCTACA TCTCACAGCA GAAGACGGGC GTGGGAGGGA 600 

CTGGAATTGA CATCOCTGTC CTGCTCCTCC TGATTGATGQ TGATGAGAAG ATGTTGACGC 660 

GAATAGAGAA CGCCACCCAG GCTCAGCTCC CATGTCTCCT CGTGGCTGGC TCAGGGGGAG 720 

CTGCGGACTG CCTGGCGGAG ACCCTGGAAG ACACTCTGGC CCCAGGGAGT GGGGGACCCA 780

GGCAAGGCGA AGCCCGAGAT CGAATCAGGC GTTTCTTTCC CAAAGGGGAC CTTGAGGTCC 840 

TGCAGGCCCA GGTGGAGAGG ATTATGACCC GGAAGGAGCT CCTGACAGTC TATTCTTCTG 900

AGGATGGGTC TGAGGAATTC GAGACCATAG TTTTGAAGGC CCTTGTGAAG GCCTGTGGGA 960 

GCTCGGAGGC CTCAGCCTAC CTGGATGAGC TGCGTTTGGC TGTGGCTTGG AACCGCGTGG 1020 

ACATTGCCCA GAGTGAACTC TTTCGGGGGG ACATOCAATG GCGGTCCTTC CATCTCGAAG 1080

CTTCCCTCAT GGACGCCCTG CTGAATQACC GGCCTQAGTT CGTGCGCTTG CTCATTTCCC 1140 

ACGGCCTCAG CCTGGGCCAC TTCCTGACCC CGATGCGGCT GGCCCAACTC TACAGCGCGG 1200 

CGCCCTCCAA CTCGCTCATC CGCAACCTTT TGGACCAGGC GTCCCACAGC GCASGCACCA 1260 

AAGCCCCAGC CCTAAAAGGG CGAGCTGCGO AGCTCCGGCC CCCTQACGTG GGGCATGTGC 1320

TGAGGATGCT GCTGGGGAAG ATGTGCGCGC GGAGGTACCC CTCCGGGGGC GCCTGGGACC 1380

CTCACCCAGG CCAGGGCTTC GGGGAGAGCA TGTATCTGCT CTCGGACAAG GCCACCTCGC 1440 

OGCTCTCGCT GGATGCTGGC CTCGGGCAGG CCCCCTGGAG CGACCTOCTT CTTTGGGCAC 1500 

TGTTGCTGAA CAGGGCACAG ATGGCCATGT ACTTCTGGGA GATGGGTTCC AATGCAGTTT 1560 

CCTCAGCTCT TGGGGCCTGT TTGCTGCTCC GGGTGATOGC ACGCCTGGAjS CCTQACGCTG 1620

AGGAGGCAGC ACGGAGGAAA GACCTGGCGT TCAAGTTTGA GGGGATGGGC GTTGACCTCT 1680

fTGGCGAGTG CTATOGCAGC AGTGAGGTGA GGGCTGCOCG CCTCCTCCTC CQTCGCTGCC 1740 

CGCTCTGGGG GGATGCCACT TGCCTCCAGC TGGCCATSCA ASCTGACGCC CGTGCCTTCT 1800 

TTGCCCAGGA TGGGGTACAG TCTCTGCTGA CACAGAAGTG GTGGGGAGAT ATGGCCAGCA 1860 

CTACACCCAT CTGGGCCCTO GT TCTCGCCT TCTTTTGCCC TCCACTCATC TACACCCGCC 1920

TCATCACCTT CAGGAAATCA GAAGAGGAGC CCACACGGGA GGAGCTAGAG TTTGACATGG 1980

ATAGTGTCAT TAATGGGGAA GGGCCTGTCG GGACGGCGGA CCCAGCCGAG AAGACGCCGC 2040 

TGGGGGTCCC GCGCCAGTCG GGCCGTCCGG GTTGCTGCGG GGGCCGCTGC GGGGGGCGCC 2100 

GGTGCCTACG CCGCTCGTTC CACTTCTGGG GCGCGCCGGT GACCATCTTC ATGGGCAACG 2160 

TGGTCAGCTA CCTGCTOTTC TTGCTGCTTT TCTCGCGGGT GCTGCTCGTG GATTTCCAGC 2220 

CGGCGCCGCC CGGCTCCCTG GAGCTGCTGC TCTATTTCTG GGCTTTCACG CTGCTGTGCG 2280

AGGAACTGCG CCAGGGCCTG AGCGGAGGCG GGGGCAGCCT CGCCAGCGGG GGCCCCCGGC 2340 

CTGGCCATGC CTCACTGAGC CAGCGCCTGC GCCTCTACCT CGCCGACAGC TGGAACCAGT 2400 

GCGACCTAGT GGCTCTCACC TGCTTCCTCC TGGGCGTGGG CTGCCGGCTG ACCCCGGGTT 2460 

TGTACCACCT GGGCCGCACT GTCCTCTGCA TCGACTTCAT GGTTTTCACG GTGCGGCTGC 2520 

TTCACATCTT CACGGTCAAC AAACAGCTGG GGCCCAAGAT CGTCATCGTG AGCAAGATGA 2580

TGAAGGACGT GTTCTTCTTC CTCTTCTTCC TCGGCGTGTG GCTGGTAGCC TATGGCGTGG 2640 

CCACGGAGGG GCTCCTGAGG CCACGGGACA GTGACTTCCC AAGTATCCTQ CGCCGCGTCT 2700 

TCTACCGTCC CTACCTGCAG ATCTTCGGGC AGATTCCCCA GGAGGACATG GACGTGGCCC 2760 

TCATGGAGCA CAGCAACTGC TCGTCGGAGC CCGGCTTCTG GGCACACCCT CCTGGGGCCC 2820 

AGGCGGGCAC CTGCGTCTCC CAGTATGCCA ACTGGCTGGT GGTGCTGCTC CTCGTCATCT 2880

TCCTGCTCGT GGCCAACATC CTGCTGGTCA ACTTGCTCAT TGCCATGTTC AGTTACACAT 2940 

TCGGCAAAGT ACAGGGCAAC AGCGATCTCT ACTGGAAGGC GCAGCGTTAC CGCCTCATCC 3000 

GGGAATTCCA CTCTCGGCCC GCGCTGGCCC CGCCCTTTAT CGTCATCTCC CACTTGCGCC 3060 

TCCTGCTCAG GCAATTGTGC AGGCGACCCC GGAGCCCCCA GCCGTCCTCC CCGGCCCTCG 3120 

AGCATTTCCG GGTTTACCTT TCTAAGGAAG CCGAGCGGAA GCTGCTAACG TGGGAATCGG 3180
TGCATAAGGA GAACTTTCTG CTGGCACGCG CTAGGGACAA GCGGGAGAGC GACTCCGAGC 3240

GTCTGGAGCG CACGTCCCAG AAGGTGGACT TGGCACTGAA ACAGCTGGGA CACATCCGCG 3300 

AGTACGAACA GCGCCTGAAA GTGCTGGAGC GGGAGGTCCA GCAGTGTAGC CGCGTCCTGG 3360 

GGTGGGTGAC GTAGGCCGTT AGCAGCTCTG CCATGTTGCC CTCAGCTGGG CCGCCACCCC 3420 

TTGACCTGCA TGGGTCCAAA GAGTGAGGCA TGCTGGOGQA TTTTAAGGAG AAGOCCCCAC 3480

AGGGGATTTT GCTCTTAGAG TAAGGCTCAT GTGGGCCTGG GCCCCCGCAC CTGGTGGCCT 3540 

TGTCCTTGAG GTGAGCCCCA TGTCCATCTG GGCCACTGTC AGGACCAOCT TTGGGAGTGT 3600 

CATCCTTACA AACCACAGCA TGCCCGGCTC CTCCCAGAAC CAGTCCCAGC CTGGGAGGAT 3660 

CAAGGCCTGG ATCCC66GCC GTTATCCATC TGGAGGCTGC AGGGTCCTTG GGGTAACAGG 3720 

GACCACAGAC CCCTCACCAC TCACAGATTC CTCACACTQO GQAAATAAAG CCATTTCAGA 3780 
GGAAAAAAAA AAAAAAAAAA AAAAAAAAAA 

SEQ ID NO:106 PEU5 Protein sequence 
Protein Accession #: NP_060106



1 11 21 31 41 51 

I I I I I I 

HASTGOTKW AMGVAPWGW ENRDTLINPK GSPPARYRHR GDPEDGVQPP LBYNYSAFFL 60 

VDDGTHGCLG GENRFRLRLE SYISQQKTGV GGTGIDIPVL LLLZDGDEKH LTRIENATQA 120 

QLFCU.VAGS CGAADCLABT LEDTLAPGSG GARQGBARDR 1KRPPPKGDL EVLQAQVERI 180

MTRKHLLTVY SSBDGSEEFB TIVLKALVKA CGSSBASAYIt DELRLAVAWN RVDIAQSELP 240 

RGDIQHRSFH IiBASUfflALL NDRFEFVRLL ISHGLSLGHF LTEHRLAQLY SAAPSNSLIR 300 

NLLDQASBSA GTKAPALKGG AAELRPPDVG EVLRMLW3KM CAPRYPSGGA WDPHPGQGPG 360 

BSHXLLSOKA TSPLSLDAGL GQAPWSDLLfc WALLRJRAQM AKfFWEMGSN AVSSALGACL 420

LLRVMARLEP DABEAARRKD LAFKFEGMGV DLFGBCYRSS EVRAARLLLR RCPLWGDATC 480

LQLAMQADAR AFFAQDGVQS LLTQKWWGDH ASTTPIWALV LAFFCPPLIY TRLITPRKSE 540 

EEPTREELEF DHDSVOGB6 PVGTADPABK TPUJVPRQSG RPGCCGGRCG GRRCLRKWFH 600 

PWGAFVTIFM OJWSYLLPL LLFSRVLLVD PQPAPPGSLE LLLYFWAPTL LCEELRQGLS 660 

GGGGSLASGG PGPGHASLSQ RLRLYLRDSW NQCDLVALTC FLLGVSCRLT PGLYHLGRTV 720 

LCIDPHVPTV RLLHIFTVNK QLGPKIVIVS KMHKDVFFPL FFLGVWLVAY GVATE6LLRP 780

RDSDFPSILR RVFYRPYLQI FGQIPQEDMD VAtMEHSSCS SEPGFWAHPP CAQACTCVSQ 840 

YANWLWLLL VIFLLVANXL LVNLLIAHFS YTFGKVQGNS DLYWKAQRYR LIREFHSRPA 900 

IAPPFIVISH LRLLLRQLCR RPRSPQPSSP ALEHFRVYLS KEABRKLMW ESVHKENFLL 960 
ARAHDKRESD SKRLERTSQK VDLALKQLGB IREYEQRLKV LERBVQQCSR VLGWVT 



SEQ ID NO:107 PEW3 DNA SEQUENCE

Nucleic Acid Accession #: NM_005982

Coding sequence: 276-1130 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

GGTAGCAGCA TCCACCGGGC GGGAGG7CGG AGGCAGCAAG GCCTTAAAGG CTACTGAGTG 60 

C6CCGGCCGT TCCGTGTCCA GAACCTCCCC TACTCCTCCG CCTTCTCTTC CTTGGCCGCC 120 

CACCGCCAAG TTCCGACTCC GGTTTTCGCC TT7GCAAAGC CTAAGGAGGA GGTTAGGAAC 180 

AG0C6CG0CC CCCTCCCTGC G6CCGCCGCC CCCTGCCTCT CGGCTCTGCT CCCTGCCGCO 240 

TGC6CCTGGG CCGTGCGCCC CGGCAGGCGC CAGCCATGTC GATGCTGCCG TCGTTTGGCT 300 

TTACGCAGGA GCAAGTGG06 TGCGTGTGCG AGGTTCTGCA GCAAGGCGGA AACCTGGAGC 360 

GCCTGGGCAG GTTCCTGTGG TCACTGCCCG CCTGCGACCA CCTGCACAAS AACGAGAGCG 420 

mCTCAAGGC CAAGGCGGTG GTCGOCTTCC ACCGCGGCAA CTTOCGTGAG CTCTACAAGA 480 

TCCTGGAGAG CCACCAGTTC TCGCCTCACA ACCACCCCAA ACTGCAGCAA CTGTGGCTGA 540 

AGGCGCATTA CGTGGAGGCC GAGAAGCTGC GCGGCCGACC CCTGGGCGCC GTGGGCAAAT 600 

ATCGGGTGCG CCGAAAATTT CCACTGCCGC GCACCATCTG GGACGGCGAG GAGACCAGCT 660 

ACTGCTTCAA GGAGAAGTCG AGGGGTGTCC TGCGGGAGTG GTACGCGCAC AATCCCTACC 720 

CATCGCCGCG TGAGAAGOGG GAGCTGGCCG AGGCCACOGG CCTCACCACC ACCCAGGTCA 780 

GCAACTGGTT XAAGAACCGG AGGCAAAGAG ACOGGGOCGC GGAGGCCAAG GAAAGGGAGA 840 

ACACCGAAAA CAATAACTOC TCCTCCAACA AGCAGAACCA ACTCTCTCCT CTGGAAGGGG 900

GCAAGCCGCT CATGTCCAGC TCAGAAGAGG AATTCTCAOC TCCCCAAAGT CCAGACCAGA 960 

ACTCGGTCCT TCTGCTGCAG GGCAATATGG GCCACGCCAG GAGCTCAAAC TATTCTCTCC 1020 

CGGGCTTAAC AGCCTCGCAG CCCAGTCACG GCCTGCAGAC CCAOCAGCAT CAGCTCCAAfl 1080 

ACTCTCTGCT CGGCCCCCTC ACCTOCAGTC TGGTGGACTT GGGGTCCTAA GTGGGGAGGG 1140 

ACTGGGGCCT CGAAGGGATT CCTGGAGCAG CAACCACTOC AGCGACTAGG GACACTTGTA 1200 

AATAGAAATC AGGAACATTT TTGCAGCTTG TTTCTGGAGT TGTTTGCGCA TAAAGGAATG 1260 

GTGSACTTTC ACAAATATCT TTTTAAAAAT CAAAACCAAC AGCGATCTCA AGCTTAATCT 1320 
CCTCTTCTCT CCAACTCTTT CCACTTTTGC ATTTTCCTTC CCAATGCAGA GATCAGGG 

SEQ ID NO:108 PEW3 Protein sequence
Protein Accession #: NP_005973



1 11 21 31 41 51

| | | | | |

HSMLPSFGFT QEQVACVCEV LQQGGNLERL GRPLWSLPAC DHLHKNESVL KAKAWAFHR 60 

GNFRELYKIL ESHQFSPHHH PKLQQLWLKA HYVEAEKLRG RPLGAVGKYR VRRKPPLPRT 120 

IWDGEETSYC FKEKSRGVLR EWYAHMPYPS PRBKRELAEA TGLTTTQVSN HFKNRRQRDR 180 

AAEAKERENT EKNNSSSNKQ NQLSPLEGGK PLHSSSEEEF SPPQSPDQNS VLLLQGNMGH 240
ARSSNVSLPG LTASQPSHGX. QTHQHQLQDS LLGPLTSSLV DLGS 






SEQ ID NO:109 PFJ8 DNA SEQUENCE

Nucleic Acid Accession #: NM_005069

Coding sequence: 57-2060 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51

I I I I I I 

GGOGCTCCGC GGGOCTGGAG CACGGCCGGG TCTAATATGC OOGGAGOOGA GGCGCGATG.A 60 
AGO AOAAOTC CAAGAATGCG GCCAAGACCA GOAQGG AG AA GQAAAATGGC G AGTTTTACG 120 
AGCITGOCAA GCTGCTCCCG CTGCCGTCGG CCATCACTTC GCAGCTGGAC AAAGCGTCCA 180
TCATCCGCCT CACCACGAGC TACCTGAAGA TGCGCGCCGT CTTCOCCGAA GGTTTAGGAG 240 
ACGCGTGGGG ACAGCOGAGC CGCGCCGGGC CCCTGGACGG CGTCGCCAAG GAGCTGGGAT 300 
COCACTTGCT GCAOACTTTO GATGGATTTG 1 111 IU1 GGT AGCATCTOAT GGCAAAATCA 360 
TGTATATATC OGAG ACOGCT TCTGTCCATT TAGGCTTATC CCAGGTGGAG CTCACGGGCA 420 
ACAGTATTTA TGAATACATC CATCCTTCTG ACCACG ATG A GATGAOCGCT GTCCTCACGG 480 
CCCACCAGCC GCTGCACCAC CACCTGCTCC AAGAGTATGA GATAGAGAGG TOGTTCTTTC 540 
TTCG AATGAA ATGTGTCTTG GCGAAAAGG A AOGCGGGCCT GACCTGCAGC GGATACAAGG 600 
TCATCCACTG CAGTGGCTAC TTQAAGATCA GGCAGTATAT GCTGQACATG TCCCTGTACG 660 
ACTCCTGCTA CCAGATTGTG GGGCTGGTOO OGGTGGGCCA GTCGCTGOCA CCCAGTGCCA 720 
TCACCGAG AT CAAGCTGTAC AGTAACATGT TCATGTTCAG GGCCAGCCTT GACCTGAAGC 780 
TGATATTCCT GG ATTCCAGG GTO ACCGAGG TGACGGGTTA CGAGOOGCAG GACCTGATOG 840 
AGAAGAOCCT ATACCATCAC GTGCACGGCT GCGAOGTGTT GCACCTCCGC TACGCACAOC 900 
ACCTCCTGTT GGTGAAGGGC CAGGTCAGCA CCAAGTACTA COGGCTGCTG TCCAAGCGGG 960 
GCGGCTGGGT GTGGGTGCAG AGCTACGCCA CCGTGGTGCA CAACAGCCGC TCGTCCCGCC 1020 
COCACTGCAT CGTGAGTGTC AATTATGTAC TCAGGG AGAT TGAATACAAG GAACTTCAGC 1080 
TGTOCCTGGA GCAGGTGTCC ACTGCCAAGT CGCAGGACTC CTGGAGGACC GCCTTGTCTA 1140 
CCTCACAAGA AACTAGG AAA TTAGTGAAAC CCAAAAATAC CAAGATGAAG ACAAAGCTGA 1200 
GAACAAAGCC TTACCCCCCA CAGCAATACA GCTCGTTCCA AATGGACAAA CTGGAATGCG 1260 
GCCAGCTCGG AAACTGGAGA GCCAGTCGCC CTGCAAGOGC TGCTGCTCCT CCAGAACTGC 1320 
AGOCCCACTC AGAAAGCAGT G ACCTTCTOT ACACGCCATC CTACAGCCTG OCCTTCICCT 1380 
ACCATTACGG ACACTTOCCT CTGGACTCTC ACGTCTTCAG CAGCAAAAAG CCAATGTTGC 1440 
CGGCCAAGTT CGGGCAGCCC CAAGGATCCC CTTGTGAGGT GGCACGCTTT TTCCTGAGCA 1500
CACTGOCAGC CAGCGGTG AA TGGCAGTGGC ATTATGGCAA CGOCCTAGTG CCTAGCAGCT 1560 
CGTCTCCAGC TAAAAATCCT CCAGAGCCAC CGGOG AACAC TCCTAGGCAC AGCCTGGTGC 1620 
CAAGCTACGA AGCGOCCGCC GCCGCCGTGC GCAGGTTGGG CG AGGACACC GCGCCCCCG A 1680 
GCTTCCOGAG CTGCGGGCAC TACCGCGAGG AGCCCGCGCT GGGCCCGGCC AAAGCCGCCC 1740 
GCCAGGCCGC CCGGGACGGG GCGCGGCTGG OGCTGGCCOG CGOGGCACCC GAGTGCTGCG 1800 
CGCCCCCGAC CCCCGAGGCC CCGGGCGCGC CGGCGCAGCT GOXTTCGTG CTGCTCAACT 1860 
ACCACCGCGT GCTGGCCCGG CGCGGACCGC TGGGGGGCGC CGCACCCGCC GCCTCCGGCC 1920 
TGGCCTGCGC TCOOGGGGGC CCCGAGGOGG CGACCGGCGC GCTGCGGCTC CGGCACCCGA 1980 
GCCCCGCCGC CACCTCCCCG CCCGGCGCGC CCCTGCCGCA CTACCTGGGC GCCTCGGTCA 2040 
TCATCACCA A CGGQAGGTGA CCCGCTGGCC GCCCGCGCCA GGAGCCTGOA CCCGGCCTOC 2100 
CGGGGCTGCG GCGCCACCG A GCCCGGCAAA TCCGCACG AC CTACATTAAT TTATGCAG AG 2160 
ACAGCTGTTT G AATTGGAGC GCGGCGCCGA CTTGGGG ATT TCCACCGCGG AGGCCGGGCG 2220 
CGCCGGTGCC G AGGGOCGAG GAGCGCCCGG GTCCGGGCAG GTGACCGCCC GCCTCTGTCC 2280 
TGCGAGGGCC GGTGCGACCC AGTTGCTGGG GGCTIGGTITCCTCACCTTG AAATOGGGCT 2340 
TCACGCGTCT TGCCTTGTCC CCAACGTTCC ACAACAGTCC CGCTGGGGGA TTGAAGCGGT 2400 
TTCACTCCGC AAATATCCTC CACTTTCAGG AGGGAAAACC CACCCTACCA CAGTCCGCTC 2460 
TTCCAAGTGG AGGGCAGACC TGGGAGGGGA CGCCTGTGTC ACGAGCCCTT TTAGATGCTT 2520 
AGGTGAAGGC AGAAGTGATG ATTGTAAGTC CCATGAATAC ACAACTCCAC TGTCTTTAAA 2580 
AGTCATTCAA GAGTCTCATT ATTTTTGTTT TTATTTAACC CTTTCTTCAA TACAAAAAGC 2640 
CAACAAACCA AGACTAAGGG GGTGACCATG CAATTCCATT TTGTGTCTGT GAACATAGGT 2700 
GTGCTTCCCA AATACATTAA CAAGCTCTTA CTTCCCCCTA ACCOCTATG A ACTCTTGATA 2760 
ACACCAAGAG TAGCACCTTC AGAATATATT GAATAGGCAT TAAATGCAAA AATATATATG 2820 
TAGCCAGACA GTTTATGAGA ATGACCCTGT CAAGCTTCAT TATTACGTGG CAAAATCCCT 2880 
CTGGOCCACA CAGATCTGTA ATTCACTAGG CTCGTGTTTG CTACAAATAG TGCTAATAAA 2940 
GTTAAATTGC AOGTGCAATA CGGAACACTG TCAATGGACT GCACCTTGTG AAGGAAAAAC 3000 
ATGCTTAAGG GGGTGTAATG AAAATGATGT AGACATTTTA AGCATTTTCT ACACAGOGAG 3060 
AAAACTTOGT AAGAACATGT TAOGTGTGCA ACAGGTAAAC AGAAATCCTT TCATAAAGCA 3120 
CCAGCAGTGT TTAAAAAATG AGCTTCCATT AATTTTTACT TTTTATGGGT TTTGCTTAAA 3180 
GATCTCAACA TGGAAAAATC CTGTCATGGC TCTGAACTGC ACAATGCATT GAACOGOCGT 3240 
CCTTCAATTT TCTTCACACT ATCAACACTG CAGCATTTTG CTGCTTTATC AAAATGGTTT 3300 
ATTTTAGGAA A CTTTTTCC A CCTTTCTGAA TGGAAAGAGG TTTTCACAAA TGTTTTAAAC 3360 
TCATCGTTCT AAAATCAAGT GCACCTACAC CAACTGCTCT CAAAATGTG A ACTGACTTTT 3420 
I I 11 1 1 1 1 IT TTI 1G CCAAC CCTGTGTCAC TTAGTGAGGA CCTGACACAA TCCCTACAGG 3480 
GTGTCTGTCA GTGGGCCTCA TGGTAAGAGT CACAATTTGC AAATTTAGGA CCGTGGGTCA 3540 
TGCAGCGAAG GGGCTGGATO GTAGGAAGGG ATGTGCCCGC CTCTCCACGC ACTCAGCTAT 3600 
ACCTCATTCA CAGCTCCTTG TGAGTGTGTG CACAGGAAAT AAGCCGAGGG TATT Ai II I I 3660 
TTATGTTCAT GAGTCTT GTA A TTAAACCGT G ATTCTTGAA AGGTGTAGGT TTGATTACTA 3720 
GGAGATACCA CCGACATTTT TCAATAAAGT ACTGCAAAAT GCTTTTGTGT CTAOCTTGTT 3780 
ATTAACnTT GGGGCTGTAT TTAGTAAAAA TAAATCAAGG CTATCGGAGC AGTTCAATAA 3840 
CAAAGGTTAC TGTTGAGAAA AAAGACCCTA TCATAGATTT ACAAG 



SEQ ID NO:110 PFJ8 Protein sequence:
Protein Accession #: NP_005060.1

1 11 21 31 41 51

I I I I I I

MKEKSKNAAK TRREKENGEF YELAKLLFLP SATTSQLDKA SHRLTTS YL KMRAVFFEGL 60 
GDAWGQPSRA GPLDG V AKEL GSHIXQTLDG FVFW ASDGK IMYBETAS V HLGLSQVELT 120 
GNSIYEY1HP SDHDEMTAVL TAHQPLHHHL LQEYHERSF FLRMKCVLAK RNAGLTCSGY 180 
KVIHCSGYLK KQYMLDMS L YDSCYQIVGL VAVGQSLPPS AITEIKLYSN MFMFRASLDL 240
KLIFLOSKVT BVTGYEPQDL IEKTLYHHVH GCDVFHLRYA HHLLLVKGQV TTKYYRLLSK 300
RGGWVWVQSY ATWHNSRSS RPKOVSVNY VLTEIEYKEL QLSLEQVSTA KSQDSWRTAL 360 
STSQBTRKLV KPKNTKMKTK LRTNPYPPQQ YSSBQMDKLE CGQLGNWRAS PPASAAAPPE 420 
LQPHSESSDL LYTFSYSLFF SYHYCHFPLD SHVFSSKKPM LPAKFCQFQO SPCEVARFFL 480 
STLPASGECQ WHYANPLVPS SSSPAKNPPE PPANTARHSL VPS YEAPAAA VRRPGEDTAP 540 
PSFPSCGHYR EEPALGPAKA ARQAARDGAR LALARAAPEC CAPPTPEAPG APAQLFFVLL 600 
NYHRVLARRG PLCGAAPAAS GLACAPGGPE AATOALRLRH PSPAATSPPG AFLPHYLGAS 660 
VTJTNGR 



SEQ ID NO:111 PFJ7 DNA SEQUENCE

Nucleic Acid Accession #: NM_006549

Coding sequence: 1-1254 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

AXQAAOGGAC GCTCCATCTG COCCTCCCTO CCCTACTCAC CCGTCAOCTC CCCQCAGTCC 60 
TCGCCTCGGC TGCCCCGGCO GOCGACAOTG GAGTCTCACC ACGTCTCCAT CACGGGTATG 120 
CAGGACTGTO TCCAGCTCAA TCAOTATACC CTOAAGGATG AAATTCGAAA GGGCTCCTAT 180 
GGTGTCGTCA AGTTGGOCTA CAATGAAAAT QACAATACCT ACTATGCAAT GAAGGTGCTG 240 
TOCAAAAAGA AGCTGAT0OG GCAGGCCGGC TTTCCACGTC GCCCTCCACC CCGAGGCACC 300 
CGGCCAGCIC CTGGAGGCTG CATCCAGCOC AGGGGCCCCA TTGAGCAGGT GTACCAGGAA 360 
ATTGCCATCC TCAAOAAGCT GGACCACCCC AATGTGGTGA AGCTGGTGGA GGTCCTGGAT 420 
GACCCCAATQ AGGACCATCT GTACATGGTG 1TCGAACTGG TCAACCAAGG GCCCOTGATO 480 
GAAGTGCCCA OCCTCAAACC ACICTCFGAA GACCAGGCCC GTTTCTACTT OCAGGATCTG 540 
ATCAAAGGCA TCOAGTACTT ACACTACCAO AAGATCATCC ACCGTGACAT CAAAOCTTCC 600 
AACCTCCTGG TCGOAGAAGA TGGGCACATC AAGATCGCTG A LT11U GTGT GAGCAATGAA 660 
TTCAAGGGCA GTGACGCGCT CCTCTCCAAC ACCGTGGGCA CGCCCGCCIT CATGGCACGC 720 
GAGTCGCTCT CTGAGACCCG CAAGATCTTC TCTGGGAAGG CCTTGGATGT TTGGGCCATG 780 
GGTGTGACAC TATACTGCTT TGTCTTTGGC CAGTGOCCAT TCATGGACGA GCGGATCATG 840
TGTTTACACA GTAAGATCAA GAGTCAGGCC CTGGAATTTC CAGACCAGCC CGACATAGCT 900 
GAGGACTTGA AGGACCTC AT CACCCGTATG CTGGACAAGA ACCCCGAGTC GAGGATCGTG 960 
GTGCCGG AAA TCAAGCTGCA CCOCTGGGTC ACGAGGCATG GGGCGGAGCC GTTGCCGTCG 1020 
GAGGATGAGA ACTGCAOGCT GGTGGAAGTG ACTGAAGAGG AGGTOGAGAA CTCAGTCAAA 1080 
CACATTCCCA GCTTGGCAAC CGTGATCCTG GTGAAGACCA TGATA0GTAA ACGCTCCTTT 1140 
GGGAACGCAT TCGAGGGCAG CCGGCGGGAG GAACGCTCAC TGTCAGCGCC TGGAAACTTG 1200 
CTCAGCAAAA AACCAACCAG GGAATGTGAG TCOCTGTCTG AGCTCAAGAC CJAgAAAATA 1260 
AGTCCCCTTC CTGCCTGTTG CAAAGTAACG TAAG AGTTCC CTCACGCGAG TCGATGCAG A 1320 
CGTTCTTGCT GTCAGCCACC TTOCTTCATA CACATAGCCA GGCCAGGGTG ACCAGAACGT 1380 
CCCAGGACAG ATGAGGCTTT GTGTCCTTAT GAGAGTGGGA GAACCIGGTG GGCACOOCTG 1440 
GTGCAGGTGC TGTGGTGGGT GGGGACCCCA CTGCCTTTCC CACTGAGCAC ATCATGGCTA 1500 
CCTG ACTTGG TGGGAGTTCC ATTCAGTCAC TTCTGTTTCT TAAACATAGC TTTACTGAGG 1560 
TACAATTCAC ATACCATGTA ATTCACCCAC GGGAAOTGTA TGATTCAGTG OTTTCTAATA 1620 
CACACTTCTG CAGCCATTAC CACCGTCAAC TTTACG ACAT TTTCATCAGC CCAAGAAGAC 1680 
ACCCTACACT CCTTAGCTGT CCCCATCCAA CTCCCCXACC CCAGTAACCA CICAGAATAG 1740 
GTATGGATTT GCCTATTCTG GAOGTTTCGT ATAAATGGCG TCATACACTA AAAAAAAAAA 1800 
AAAA 



SEQ ID NO:112 PFJ7 Protein sequence
Protein Accession #: NP_006540.1

1 11 21 31 41 51 
I I I I I I 

MNGRQCPSL PYSPVSSPQS SPRLPRRPTV ESHHVSITGM QDCVQLNQYT LKDEIGKGSY 60 
GWKLAYNEN DNTYYAMKVL SKKKLIRQAG FPRKPPPRGT RPAPGGCIQP RGPIEQVYQE 120 
IAJLKKLDHP NWKLVEVLD DPNEDHLYMV FELVNQGPVM EVPTLKPLSE DQARFYRJDL 180 
KGEYLHYQ KHHRDIKPS NLLVGEDGHI KIADFGVSNB FKGSDALLSN TVGTPAFMAP 240 
ESLSETRKIF SGKALDVWAM GVTLYCFVFG QCPFMDEPJM CLHSKIKSQA LEFPDQPDIA 300 
EDLKDLTTRM LDKNPESRIV VPEKLHPWV TRHGAEPLPS EDENCTLVEV TEEEVENSVK 360 
MPSLATVIL VKTMIRKRSF GNPFEGSRRE ERSLSAPGNL LTKKPTRECB SLSELKT 



SEQ ID NO:113 PFJ6 DNA SEQUENCE

Nucleic Acid Accession #: NM_021810

Coding sequence: 1-429 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I

AISAAACCTC TGATATGG AC ATGGTCAGAT GTTGAAGGCC AG AGGCCGGC TCTGCTCATC 60 
TGCACAGCTG CAGCAGGAGC CAOGCAGGGA GTTAAGGGTT ATGGCAAGCC CTTTGAGGCA 120 
AGAAGTGTGA AAAACATACA CTCTACTCCT GCTTACCCAG ATGCCACAAT GCACAGACAA 180 
CTCCTGGCTC CGGTGGAAGG AAGG ATGGCA GAGACATTG A ATCAG AAACT CCATGTrGCC 240 
AATGTGCTGG AAOATGACCC OGGCTACCTA CCTCACGTCT ACAGCG AGO A AGGGG AGTGT 300 
GGAGGGGCCC CATCCCTCAG CTCTCTGGCC AGCTTGGAAC AGGAGTTGCA ACCTGATTTG 360
CTGOACTCTT TGGGTTCAAA AGOGACTCCG TTTOAGOAAA TATATTCAGA GTCAGGTGTT 420
CCTTCCTAA 



SEQ ID NO:114 PFJ6 Protein sequence:
Protein Accession #: NP_068582.1

1 11 21 31 41 51 

MKHJWTWSD VEGQRPAIJJ CTAAAGFTQG VKGYGKPFEP RSVKNIHS1P AYPDATMHRQ 60 
LLAPVEGRMA BTLNQKLHVA NVLEDDPGYL PHVYSEEGEC GGAPSLSSLA SLEQELQPDL 120 
LDSLGSKATP FEEIYSESGV PS 



SEQ ID NO:115 PFJ5 DNA SEQUENCE

Nucleic Acid Accession #: NM_006361

Coding sequence: 131^ (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51

I I I I I I

CGAATGCAGG CGACTTGCGA GCTGGGAGCG ATTTAAAACX3 CnTGGATTC CCOCGGOCTG 60 
GGTGGGGAOA GCG AGCTGGG TGCCCCCTAG ATTCCCCGCC OCOGCACCIC ATGAGCCGAC 120 
CCTCGGCTCC ATGGAGCCCG GCAATTATGC CACCTTCGAT GGAGCCAAGG ATATCGAAGG 180 
CTTGCTGGGA GOGGG AGGGG GGCGGAATCT GGTCGCCCAC TCCCCICTGA CCAGCCACCC 240 
AGCGGCGCCT ACGCTGATGC CTGCTGTCAA CTATGCCCCC TTCGATCTGC CAGGCTCGGC 300 
GGAGCCGCCA AAGCAATGCC ACCCATGCCC TGGGGTGCCC CAGGGGACGT CCCCAGCTCC 360 
CGTGCCTTAT GGTTACTTTG GAGGCGGGTA CTACTCCTGC CGAGTGTCCC GGAGCTCGCT 420 
QAAACCCTGT GCCCAGGCAG CCACCCTGGC CGCGTACCCC GCGGAGACTC CCACGGCCGG 480 
GGAAGAGTAC COCAGTOGOC OCACTGAGTT TGCCTICTAT OOGGGATATC OGGGAACCTA 540 
CCACGCTATG GCCAGTTACC TGGACGTGTC TGTGGTGCAG ACTCTGGGTG CTCCTGGAGA 600 
ACCGCGACAT GACTCCCTGT TGCCIGTGGA CAGTTACCAG TCTTGGGCTC TCGCTGGTGG 660 
CTGGAACAGC CAGATGTGTT GCCAGGGAGA ACAGAACCCA CCAGGTCCCT TTTGGAAGGC 720 
AGCATTTGCA GACTCCAGCG GGCAGCACCC TCCTG ACGCC TGCGCCTTTC GTCGCGGCCG 780 
CAAGAAACGC ATTCCGTACA GCAAGGGGCA GTTGCGGG AG CTGGAGCGGG AGTATGCGGC 840 
TAACAAGTTC ATCACCAAGG ACAAGAGGCG CAAGATCTCG GCAGOCACCA GCCTCTCGGA 900 
GCGCCAGATT ACCATCTGGT TTCAGAACCG CCGGGTCAAA GAGAAGAAGG TTCTCGOCAA 960 
GGTGAAG AAC AGCGCTACCC CTTAAGAGAT CTCCTTGCCT GGGTGGGAGG AGCGAAAGTC 1020 
GGGGTGTCCT GGGGAGACC A G AAACCTGCC AAGCCCAGGC TGGGGCCAAG GACTCTGCTG 1080 
AGAGGCCCCT AGAGACAACA CCCTTCCCAG GCCACTGGCT GCTGGACTGT TCCTCAGGAG 1140 
CGGCCTGGGT ACCCAGTATG TGCAGGG AG A CGGAACCCCA TGTGACAGGC CCACTCCACC 1200 
AGGGT7XCCA AAGAACCTGG CCCAGTCATA ATCATTCATC CTCACAGTGG CAATAATCAC 1260 
GATAACCAGT 



SEQ ID NO:116 PFJ5 Protein sequence:
Protein Accession #: NP_006352.1

1 11 21 31 41 51
I I I I I I

MEPGNYATLD GAKDIEGLLG AGGGRNLVAH SPLTSHPAAP TLMPAVNYAP LDLPGSAEPP 60 
KOCHPCPGVP QGTSPAPVPY GYFGGGYYSC RVSRSSLKPC AQAATLAAYP AETPTAGEEY 120 
PSRPTEFAFY PGYPGTYHAM ASYLDVSVVQ TLGAPGEPRH DSLLPVDSYQ SWALAGGWNS 180 
QMCCQGEQNP PGPFWKAAFA DSSGQHPPDA CAFRRGRKKR IPYSKGQLRE LEREYAANKF 240 
ITKDKRRKIS AATSLSERQI TIWFQNRRVK EKKVLAKVKN SATP 



SEQ ID NO:117 PFJ4 DNA SEQUENCE

Nucleic Acid Accession #: NM_005828

Coding sequence: 591-2216 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

GTAACCGCTA CTCCCGGACA CCAGACCACC GCCTTCCGTA CACAGGGGCC CGCATCCCAC 60 
CCTOCCGGAC CTAAGAGCCT GGGTCCCCTG TTTCCGGAGG TCCGCTTCCC GGCCCCCAGA 120 
TTCTGGCATC CCAGCCCTCA GTGTCCAAGA CCCAGGCAGC CCGGGTCCCC GCCTCCCGGA 180 
TCCAGGCGTC CGGGATCTGC GCCACCAG AA CCTAGCCTCC TGCAGACCTC CGGCATCTGG 240 
GGGCACTCAA CCTCCTGGAG CCAAGGGCCC CACGTCCCAC CCA GAGA AAC TCTCGTATTC 300 
CCAGCTCCTA GGGCCAAGGA ACCXX5GGCGC TCCGAACTCC CAGCTTTCGG ACATCTGGCA 360 
CACGGGGCAG AGCAGAGAAG CTCAGCGCCC AGCCTGGGGA ATTTAAACAC TCCAGCTTCC 420 
AAGAGCCAAG GAACTTCAGTGCTGTGAACT CACAACTCTA AGGAGCCCTC CAAAGTTCCA 480 
GTCTCCAGGT GCTGTTACTC AACTCAGTCC TAGG AACGTC GGGTCCTGGG AAGG AGCCCA 540 
AGCGCTCCCA GCCAGCnCC AGGCGCTAAG AAACCCCGGT GCTTCCCATC ATGGTGGCCG 600 
ATCCTCCTCG AG ACTCCAAG GGGCTCGCAG CGGCGGAGCC CACCGCCAAC GGGGGCCTGG 660 
CGCTCGCCTC CATCG AGGAC CAAGGCGCGG CAGCAGGCGG CTACTGCGGT TOOCGGGACC 720 
AGGTGCGCCG CTGCCTTCGA GCCAACXTGC TTGTGCTGCT GACAGTGGTG GCCGTGGTGG 780 
CXX3GCGTGGC GCTGGGACTG GGGGTGTCGG GGGCCGGGGG TGCGCTGGCG TTGGGCCCGG 840
AGOGCTTO AG CGCCTTOGTC TTCCCGGGCG AGCTGCTGCT GCGTCTGCTG CGGATOATCA 900
TCTIGCCGCr GGTGGTGTGC AGCTTGATCG GCCGCGCCCC CAGCCTOQAC CCCOOCOCGC 960 
TCGGCCGTCT GGGCGOCIGG GCGCTGCTCT TTTTOCTGGT CAOCAOGCTG CTGGCGTCGG 1020 
CGCTCGGAGT GGGCTTGGCG CTGGCTCTGC AGCOGGGCGC CGCCTCOGCC GCCATCAAGG 1080 
CCTOCGTGGG AGCCGCGGGC AGTGCGOAAA ATGCOCCCAG CAAGGAGGTG CTCGATTOGT 1140 
TCCTGGATCT TGOQAGAAAT ATCTTCOCTT OCAAOCTGGT GTCAGCAGCC TTTOGCTCAT 1200 
ACTCTACCAC CTATGAAGAG AGGAATATCA CCGGAACCAG GGTGAAGGTG CCCGTGGGGC 1260 
AGGAGGTGGA GGGGATGAAC ATGCTGGGCT TOGTAGTCTT TGOCATOGTC TTTGGTGTGG 1320 
COCTGCXX} AA GCTGGGGCCT GAAGGGGAGC TGCTTATCCO CTTCTTCAAC TGCTTCAATG 1380 
AGGCCACCAT GG 1 1CTGOTC TCCTGGATCA TOTCGTACOC CCCIGTOOOC ATCATOTICC 1440 
TGGTGGCTGG CAAGATCGTG GAGATGGAGG ATGTGGGTTT ACTCmGCC CG CC T T GGCA 1500 
AGTACATICT GTCCTGCCTG CTGGGTCAOG CCATCCATGG GCTCCTGGTA CTGCCOCTCA 1560 
TCTACTTOCT CTTCAOCCGC AAAAACCCCT ACCGCTTCCT GTGGGGCATC GTGACGCCGC 1620 
TGGCCACTGC CTTTGGGACC TCTTCCAGTT CCGCCACGCT GCCGCTGATO ATGAAGTGCG 1680 
TGGAGGAGAA TAATGGCGTG GCCAAGCACA TCAGCCGTTT CATCCTGCCC ATCGGCGCCA 1740 
CCGTCAACAT GGACGGTGCC CCGCTCTTCC AGTGCGTGGC CCCAG1GT PC ATTGCACAGC 1800 
TCAGCCAGCA GTCCTTGGAC TTCGTAAAG A TCATCACCAT CCTGGTCACG GCCACAGCGT 1860 
CCAGCGTGGO GGCAGCGGOC ATCCCTGCTG GAGGTGTOCT CACTCTGGCC ATCATCCTCO 1920 
AAGCAGTCAA CCTCCCGGTC G ACCATATCT OCTTG ATCCT GGCTGTGGAC TGGCTAGTCG 1980 
ACCGGTOCTO TACCGTOCTC AATGTAGAAO GTGACGCTCT GGGGGCAGGA CTCCIOCAAA 2040 
ATTATGTGGA CCGTACGGAG TCGAGAAGCA CAGAGCCIGA GTTGATACAA GTGAAOAGIO 2100 
AGCTGCCCCT GGATCCGCTQ CCAGTCCCCA CTGAGG AAGG AAACCOOCIC CTCAAACACT 2160 
ATOGGGGGCC OOCAGOOGAT GOCACGGTCO CCICTGAOAA GGAATCAGTC ATOTAAACCC 2220 
CGGGAGGGAC CTTOCCTGCC CTGCTGGGGG TGCTCTTTGG ACACTGGATT ATCAGGAATG 2280 
GATAAATGGA TG AGCTAGGG CTCTGGGGGT CTGCCTGCAC ACTCTGGGGA GCCAGGGGCC 2340 
CCAGCAOCCT CCAGGACAGQ AGATCTGGGA TGCCTGGCTO CTGGAGTACA TGTGTTCACA 2400 
AGGGTTACTC CTCAAAACCC CCAGTTCTCA CTCATOTCCC CAACTCAAGG CTAGAAAACA 2460 
GCAAGATGGA GAAATAATGT TCIGCTGCGT CCCCACCGTG ACCIGCCTGG CCICCCCTGT 2520 
CTCAGGGAGC AGGTCACAGG TCACCATGGG GAATTCTAGC GCCCACIGGQ GGGATGTTAC 2580 
AACACCATGC TGGTTATTTT GGCGGCTGTA GTTGTGGGGG GATGTGTGTO TGCACOTQTG 2640 
TGIGTGTGTC TCTGTGTGTG TBTGTGTCTG TTCTGTGACC TCCTCTCCOC ATGGTACGTC 2700 
CCACCCTGTC CCCAGATCOC CTATTCCCTC CACAATAACA GAAACACTOC CAGGGACTCT 2760 
GGGG AGAGGC TGAGGACAAA TACCTGCTGT CACTCCAGAG GACATTTTTT TTAGCAATAA 2820 
AATTGAGTGT CAACTATTTA AAAAAAAAAA AAAAAA 



SEQ ID NO:118 PFJ4 Protein sequence:
Protein Accession #: NP_0058l».l

1 11 21 31 41 51
I I I I I I 

MVADPPRDSK GLAAAEPTAN GGLALASED QGAAAGGYCG SRDQVRRCLR ANLLVLLTW 60 
AWAGVALGL GVSGAGGALA LGPERLSAFV FPGELLLRLL RMHIPLVVC SUGGAASLD 120 
PGALGRLGAW ALLFFLVTTL LAS ALGVGLA LALQPGAASA AINASVGAAG SAENAPSKEV 180 
LDSFLDLARN ffPSNLVSAA FRS YSTTYEE RNITGTRVKV PVGQEVEGMN ILGLWFAtV 240 
R3VALRKLGP EGELLIRFFN SFNEATMVLV SWftlWYAPVG IMFLVAGKtV EMEDVGLLFA 300 
RLGKY1LCCL LGHAIHGLLV LPUYFLFTR KNPYRFLWGI VTPLATAFGT SSSSATLPLM 360 
MKCVEENNG V AKHBRFUP 1G ATVNMDGA ALFQCVAAVF 1AQLSQQSLD FVK0TILVT 420 
ATASSVGAAG IPAGGVLTLA IILEAVNLPV DHISULAVD WLVDRSCTVL NVEGDALGAG 480 
LLQNYVDRTE 5R STEPEU Q VKSELPLDPL PVPTEEGNPL LKHYRGPAGD ATVASEKESV 540
M 



SEQ ID NO:119 PFJ3 DNA SEQUENCE

Nucleic Acid Accession #: NH_00S703

Coding sequence: 88-642 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

CTAGTTAAGG CGGCACAGGG OCGAOGCGTA GTGTGGGTGA CTCCTCOGTT CCTTGGGTOC 60 
CGTCGTCTGT GATACTGCAG TTCAGC CATG GCAGAACCGC AGCCCCCGTC CGGCGGCCTC 120 
ACGGAOGAGG CCGCOCTCAG TTGCTGCTCC GACGCGGACC CCAGTACCAA GGATTTTCTA 180 
TTGCAGCAG A CCATGCTACG AGTGAAGQ AT CCTAAG AAGT CACTGG ATTT TTATACTAG A 240 
GTTCTTGGAA TGACGCTAAT CCAAAAATGT GATTTTCCCA TTATGAAGTT TTCACTCTAC 300 
TTCTTGGCrT ATGAGGATAA AAATGACATC CCTAAAGAAA AAGATGAAAA AATAGCCTGG 360 
GCGCTCTCCA GAAAAGCTAC ACTTG AGCTG ACACACAATT GGGGCACTGA AGATGATGCG 420 
ACCCAG AGTT ACCACAATGG CAATTCAGAC CCTCGAGGAT TCGGTCATAT TGGAATTOCT 480 
GTTCCTOATG TATACAGTGC TTCTAAAAGG TTFOAAGAAC TGGGAGTCAA ATTTOTGAAG 540 
AAACCTGATG ATGGTAAAAT GAAAGGCCTG GCATTTATTC AAGATOCTGA TGGCTACTGG 600 
ATTGAAATTTTGAATCCTAA CAAAATGGCA ACCTTAATGIjiQTGCrGTGA GAATTCTCCT 660 
TTGAGATTTC AGAAGAAAGG AAACAATGTO ATICAAGATA TTTACATACC AOAAGCATCT 720 
AGGACTGATG GATCACTGTC CCGATTCAAA TTATTCTTCA GTCCATTPCC CCTTCCTATT 780 
TCAGCTGTTC CTTTTCACCT AACTGTTCAG TCATTCTGGT TTTCAAGCAG TGCTTTATCT 840 
CATGTCCTTG AATATAGTTG TGTAACT TTA TTTTTTAGGT AATAATTAGA ACAGTTCOCT 900 
TCAGAGGCTG CAT7TGCCTTCTTCTGCCAC CTAAATATTA CTTCCCTTCA AATCTGCCTT 960 
TG AATCATCA TTTTTAAAAA AAAAT TAACA TGTTTTTGTT GTAGTTATCT TCTGGGGTTT 1020 
CAATTCCTCA GAAACAACli' 1 1 1 1CACAAC GGAAAGGAAA GAACACTAGT ii 1 ' lCl ' l 1C AG 1080 
TAAAGTACAA AGTOTTTATT TTACAAAAGA GTAGGTACTC TTGAGAGCAA TTCAAATCAT 1140
GCTQACAAQG ATAC IOATAO AAAAAOTOAT T T CtlCT TAT TATAAAGTAC ATTTAAAGTT 1200
CAAGGACTAA CCTTATTTAT TTGGGAAAGG GGAGGAGG AA OG AAATOATA TGGTACCCAG 1260 
ACACTGGGCT AOGCTGCAAC TTTATCTCAT TTAATACTCC CAGCTGTCAT GTGAGAAAOA 1320 
AAGCAGGCTA GGCATGTGAA ATCACTTTCA TGGATTATTA ATGGATTTAA GAGGGCATCA 1380 
ATCAGCTCAA CTCAAGATTT CATAATCATT TTTAGTATTT AG ATTGTGCC TCAAAGTTGT 1440 
AGTAOCTCAC AATACCTOCA CTGGTTTOCT GTTOTAAAAA CCTTCAGTGA GTTTGACCAT 1500 
rGTGCTCTTG GCTCTTGGGC TGGAGTAOOQ TCGTGAGGGA OTAAACACTA GAAGTCTTTA 1560 
GTACAAAACT GCTCTAGGGA CACCTGOTGA TTCCTACACA AGTOATGTTr ATATTTCICA 1620 
TAAAGAGTCT TCOCTATOOC AAGGTCTTCA TGATGCCAGT AGOCATATAT GATAAATTAT 1680 
GTTCAGTGAT AACT TAGTTA TCAOAAATCA OCTCAGTGGT CTTCCCCGCC ATGATTCACA 1740 
TiTGATG AGT TTTTAAAAAT CAAAGTGATT TTG AAAATCT CTAATGGCTC AGAAAATAAA 1800 
AACATOCAGT TTGTGGATGA CTATATTTAG ATTTCTCTAG ACTCTAGTGG AAGAOCTTTG 1860 
GAAAGGCCAT GCCAACCGTO CTTGTACTGC TAGAAGCACT TTATGTTTCC TnTTOGGTG 1920 
AAATGGATTT ATGTGAGTGC TTTAAACAAA TAGCAATACT TATAG ACTGA AATAAAATGA 1980 
AACTTCAAATAAG 



SEQ ID NO:120 PFJ3 Protein sequence
Protein Accession #: NP_006699.1

1 11 21 31 41 51
I I I I I I 

MAEPQPPSGG LTDEAALSCC SDADPSTKDF LLQQTMLRVK DPKKSLDFYT RVLGMTUQK 60 
CDFHMKFSL YFLAYEDKND IPKEKDEKIA WALSRKATLE LTHNWGTEDD ATQSYHNGNS 120 
DPRGFGHIGI A VPD VYS ACK RFEELGVKFV KKPDDGKMKG LAFKJDPDGY WIEONPNKM 180 
ATLM 



SEQ ID NO:121 PFJ2 DNA SEQUENCE

Nucleic Acid Accession #: NM_002867

Coding sequence: 70-723 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

CCGAOGGCAG OTCCTGOCGT CCOGOCGACC GTCCGGGAGC GAACCCGTCG TCCCGCACTG 60 
G AGTCGG CGAJGjG CTTCA GT GACAOATGGT AAACATOGAG TCAAAGATGC CTCTGACCAG 120 
AATTTTGACT ACATGTTTAA ACTGCTTATC ATTGGCAACA GCAGTGTTGG CAAGACCTGC 180 
TTCCICTTGC GCTATGCIGA TGACACGTTC ACCCCAGCCT TCOTTAGCAC OGTGGGCATC 240 
GACXTCAAGG TG AAGACAGT CTACCGTCAC GAGAAGGGGG TJG AAACTGCA GATCTGGGAC 300 
ACAGCTGGGC AGGAGCGGTA CCGGACCATC ACAACAGCCT ATTACCGTGG GGCCATGGGC 360 
TTCATTCTGA TGTATGACAT CACCAATOAA O AGTOCTTCA ATGCTGTCCA AG ACTGGGCT 420 
ACTCAGATCA AGACCTACTC CTGGG ACAAT GCACAAGTTA TTCTGGTGGG GAACAAGTGT 480 
GA CATG OAGO AAGA GAGGGT TGTICCCACT GAGAAGGGOC AGCTCCTTCC AGAGCAGCTT 540 
GGGTTTGATT TCTTTGAAGC CAGTGCAAAG GAGAACATCA GTGTAAGGCA GGCCTTTGAG 600 
CGOCTGGTGG ATGCCATTTG TGACAAGATG TCTGATTCGC TGGACACAGA CCCGTCGATG 660 
CTGGGCICCT CCAAGAACAC GCGTCTCICG GACACGOCAC CGCTGCTGCA GCAGAACTGC 720 
TCATGCXAfiC AAGGGCCACC TTOCTGACCT CCCCTCATTO TGGOCOCACA CCCAAGTCTG 780 
CnCTCCCTG TTACACACTG TCCGCTCT 



SEQ ID NO:122 PFJ2 Protein sequence:
Protein Accession #: NP_002858.1

1 11 21 31 41 51
| | | | | |

MASVTDGKHG VKD ASDQNFD YMFKLIUGN SSVGKTSFLL RYADBIFTPA FVSTVGIDFK 60 
VKTVYRHEKR VKLQIWDTAG QERYRTITTA YYRGAMGFIL MYDUNEESF NAVQDWATQI 120 
KTYSWDNAQV ILVGNKCDME EERWPTEKG QLLAEQLGFD FFEASAKENI SVRQAFERLV 180 
DAICDKMSDS LDTDPSMLGS SKNTRLSDTP PLLQQNCSC 



SEQ ID NO:123 PFJ1 DNA SEQUENCE

Nucleic Acid Accession #: NM_001844

Coding sequence: 158-4621 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

ACGCAG AGCG CTGCTGGGCT GCCGGGTCTC CCGCTTOCTC CICCTGCTCC AAGGGCCTCC 60 
TGCATOAGGG GGOGGTAGAG AGOCGGACCC GCGCCGTGCT OCTGCCGnT CGCTGGGCTC 120 
CGCCCGGGCC CGGCTCAGCC AGGCCCCGCG GTGAGCC&3Q ATTCGCCTCG GGGCTCCCCA 180 
GTCGCTGGTG CTGCTGAOGC TGCTOGTCGC CGCTOHxri 1 CGOTOTCAGG GCCAGGATGT 240 
CCAGGAGGCT GGCAGCTGTG TGCAGGATGG GCAGAGGTAT AATCATAAGG ATGTGTGG AA 300 
GCCGGAGCCC TGCOGGATCT GTGTCTGTGA CACTGGGACT GTCCTCTGCG ACGACATAAT 360 
CTGTG AAG AC GTGAAAG ACT GCCTCAGCCC TGAG ATCCCC TTCGG AGAGT GCTGCCCCAT 420 
CTGCCCAACT GACCTOGCCA CTGCCAGTGG GCAACCAGGA CCAAAGGGAC AGAAAGGAGA 480 
ACCTGGAGAC ATCAAGGATA TTGTAGGAOC CAAAGGACCT CCTGGGCCTC AGGGACCTGC 540 
AGGGOAACAA GGACCCAGAG GGGATCGTGG TGACAAAGGT GAAAAAGGTG CCCCTGGACC 600 
TCGTGGCAGA GATCGAGAAC CTGGGACCCC TGGAAATCCT GGCCCCCCTG GTCCTCCCGG 660 
CCCCCCTGGT CXXXCTGGTC TTGGTGGAAA CTTTGCTGCC CAGATGGCTG GAGGATTTGA 720 
TGAAAAGGCT GGTGGCGCCC AGTTGGGAGT AATGCAAGGA CCAATGGGCC CCATGGGACC 780 
TCGAGGACCT OCAGGOCCTG CAGGTGCTCC TGGGCCTCAA GGATTTCAAG GCAATCCTGG 840 
TQAACCTGGT GAACCTGGTG TCTCTGGTCC CATGGQTCCC COTGOTCCTC CTGGTCCCCC 900 
TGGAAAGCCT GGTGATGATG GTQAAGCTGG AAAACCTCGA AAAGCTGGTG AAAGGGGTCC 960 
GCCIGGTCCT CAGGGTGCTC GTGGTITCCC AGQAACCCCA GGCCTTCCTG GTGTCAAAGG 1020 
TCACAGAGGT TATCCAGGCC TGGACGGTGC TAAGGGAGAG GCGGGTGCTC CTGGTGTGAA 1080 

GGGTGAGAGT GGTTCCCCGG GTQAGAACGG ATCTCCGGGC CCAATGGGTC CTCGTGGCCT 1140 
GCCTOOTOAA AGAGOACGOA CTGGCCCTGC TGGCGCTGCG GGTGCCCGAG GCAACGATGG 1200 
TCAGCCAGGC CCCGCAGGTC CTCCGGGTCC TGTCGGTCCT GCTGGTGGTC CTGGCTTCCC 1260 
TGGTGCTCCT GGAGCCAAGG GTGAAGCCGG CCCCACTGGT GCCCGTGGTC CTGAAGGTGC 1320 
TCAAGGTCCT CGCGGTGAAC CTGGTACTCC TGGGTCCCCT GGGCCTGCTG GTGCCTCCGG 1380 

TAACOCTGGA ACAGATGGAA TTOCTGGAGC CAAAGGATCT GCTGGTGCTC CTGGCATTGC 1440 
TGGTGCTCCT GGCTTCCCTG GGCCACGGGG TCCTCCTGGC CCTCAAGGTG CAACTGGTCC 1500 
TCTGGGCCCG AAAGGTCAGA CGGGTGAACC TGGTATTGCT GGCTTCAAAG GTGAACAAGG 1560 
COCCAAGGG A GAACCTGGCC CTGCTGGCCC CCAGGGAGCC CCTGGACCCa CTGGTGAAGA 1620 

AGGCAAGAGA GGTGCCCGTG GAGAGCCTGG TGGCGTTGGG CCCATCGGTC CCCCTGGAGA 1680 

AAGAGGTGCT CCCGGAAACC GCGGTTTCCC AGGTCAAGAT GGTCTGGCAG GTCCCAAGGG 1740 
AGCCCCTGGA GAGCGAGGGC CCAGTGGTCT TGCTGGCCCC AAGGGAGCCA ACGGTGACCC 1800 
TGGCCGTCCT GGAGAACCTG GCCTTCCTGG AGCCCGGGGT CTCACTGGCC GCCCTGGTGA 1860 
TGCTGGTCCT CAAGGCAAAG TTGGCCCTTC TGGAGCCCCT OGTG AAGA TG GTCGTCCTGG 1920 
ACCTCCAGGT CCTCAGGGGG CTCOTGGGCA GCCTGGTGTC ATGGGTTTCC CTGGCCCCAA 1980 

AGGTGCCAAC GGTGAGCCTG GCAAAGCTGG TGAGAAGGGA CTGCCTGGTG CTCCTGGTCT 2040 
GAGGGGTCTT CCTGGCAAAG ATGGTGAGAC AGGTGCTGCA GGACCCCCTG GCCCTGCTGG 2100 
ACCTGCTGGT GAACGAGGCG AGCAGGGTGC TCCTGGGCCA TCTGGGTTCC AGOGACTTCC 2160 
TGGCCCTCCT GGTCCCCCAO GTGAAGGTGG AAAACCAGGT GACCAGGGTG TTCCCGGTGA 2220 
AGCTGGAGCC CCTGGCCTCG TGGGTCCCAG GGGTGAACGA GGTTTCCCAG GTGAACGTGG 2280 

CTCTCCCGGT GCCCAGGGCC TCCAGGQTCC CCGTGGCCTC CCCGGCACTC CTGGCACTGA 2340 
TGGTCCCAAA GGTGCATCTG GCCCAGCAGG CCCCCCTGGC GCACAGGGCC CTCCAGGTCT 2400 
TCAGGGAATG CCTGGCGAGA GGGGAGCAGC TGGTATCGCT GGGCCCAAAG GCGACAGGGG 2460 
TGACGTTGGT GAGAAAGGCC CTGAGGGAGC CCCTGGAAAG GATGGTGGAC GAGGCCTGAC 2520 
AGGTCCCATT GGCCCCCCTG GCCCAGCTGG TGCTAACGGC GAGAAGGGAG AAGTTGGACC 2580 

TCCTGGTCCT GCAGGAAGTG CTGGTGCTCG TGGCGCTCCG GGTGAACGTG QAGAGACTGO 2640 
CCCCCCCGGA CCAGCGGGAT TTGCTGGGCC TCCTGGTGCT GATGGCCAGC CTGGGGCCAA 2700 
GGGTGAGCAA GGAGAGGCCG GCCAGAAAGG CGATGCTGGT GCCCCTGGTC CTCAGGGCCC 2760 
CTCTGGAGCA CCTGGGCCTC AGGGTCCTAC TGGAGTGACT GGTCCTAAAG GAGCCCGAGG 2820 
TGCCCAAGGC CCCCCGGGAG CCACTGGATT CCCTGGAGCT GCTGGCCGCG TTGGACCCCC 2880 

AGGCTCCAAT GGCAACCCTG GACCCCCTGG TCCCCCTGGT CCTTCTGGAA AAGATGGTCC 2940 
CAAAGGTGCT CGAGGAGACA GCGGCCCCCC TGGCCGAGCT GGTQAACCCG GCCTCCAAGG 3000 
TCCTGCTGGA CCCCCTGGCG AGAAGGGAGA GCCTGGAGAT GACGGTCCCT CTGGTGCCGA 3060 
AGGTCCACCA GGTCCCCAGG GTCTGGCTGG TCAGAGAGGC ATCGTCGGTC TGCCTGGGCA 3120 
ACGTGGTGAG AGAGGATTCC CTGGCTTGCC TGGCCCATCG GGTGAGCCCG GCAAGCAGGG 3180 

TGCTCCTGGA GCATCTGGAG ACAGAGGTCC TCCTGGCCCC GTGGGTCCTC CTGGCCTGAC 3240 
GGGTCCTGCA GGTGAACCCG GACGAGAGGG AAGCCCCGGT GCTGATGGCC CCCCTGGCAG 3300 
AGATGGCGCT GCTGGAGTCA AGGGTGATCG TGGTGAGACT GGTGCTGTGG GAGCTCCTGG 3360 
AGCCCCTGGG CCCCCTGGCT CCCCTGGCCC CGCTGGTCCA ACTGGCAAGC AAGGAGACAG 3420 
AGGAGAAGCT GGTGCACAAG GCCCCATGGG ACCCTCAGGA CCAGCTGGAG CCCGGGGAAT 3480 

CCAGGGTCCT CAAGGCCCCA GAGGTGACAA AGGAGAGGCT GGAGAGCCTG GCGAGAGAGG 3540 
CCTGAAGGGA CACCGTGGCT TCACTGGTCT GCAGGGTCTG CCCGGCCCTC CTGGTCCTTC 3600 
TGGAG ACCAA GGTGCTTCTG GTCCTGCTGG TCCTTCTGGC CCTAGAGGTC CTCCTGGCCC 3660 
CGTCGGTCCCTCTGGCAAAG ATGGTGCTAA TGGAATCCCT GGCCCCATTG GGCCTCCTGG 3720 
TCCCCGTGGA CGATCAGGCG AAACCGGTCC TGCTGGTCCT CCTGGAAATC CTGGGCCCCC 3780 

TGGTCCTCCA GGTCCCCCTG GCCCTGGCAT CGACATGTCC GCCTTTGCTG GCTTAGGCCC 3840 

GAGAGAGAAG GGCCCCGACC CCCTGCAGTA CATGCGGGCC GACCAGGCAG CCGGTGGCCT 3900 
GAGACAGCAT GACGCCGAGG TGOATGCCAC ACTCAAGTCC CTCAACAACC AGATTGAGAG 3960 
CATCCGCAGC CCCGAGGGCT CCCGCAAGAA CCCTGCTCGC ACCTGCAGAG ACCTGAAACT 4020 
CTGCCACCCT GAGTGGAAGA GTGGAGACTA CTGGATTGAC CCCAACCAAG GCTGCACCTT 4080 

GGACGCCATG AAGGTTTTCT GCAACATGGA GACTGGCGAG ACTTGCGTCT ACCCCAATCC 4140 
AGCAAACGTT CCCAAGAAG A ACTGGTGGAG CAGCAAGAGC AAGGAGAAGA AACACATCTG 4200 
GTTTGGAGAA ACCATCAATG GTGGCTTCCA TTTCAGCTAT GGAGATGACA ATCTGGCTCC 4260 
CAACACTGCC AACGTCCAGA TGACCTTCCT ACGCCTGCTG TCCACGGAAG GCTCCCAGAA 4320 
CATCACCTAC CACTGCAAGA ACAGCATTGC CTATCTGGAC GAAGCAGCTG GCAACCTCAA 4380 

GAAGGCCCTG CTCATCCAGG GCTCCAATGA CGTGGAGATC CGGGCAGAGG GCAATAGCAG 4440 
GTTCACGTAC ACTGCCCTGA AGGATGGCTG CACGAAACAT ACCGGTAAGT GGGGCAAGAC 4500 
TGTTATCGAG TACCGGTCAC AGAAGACCTC ACGCCTCCCC ATCATTGACA TTGCACCCAT 4560 
GGACATAGGA GGGCCCGAGC AGGAATTCGG TGTGGACATA GGGCCGGTCT GCTTCTTGIA 4620 
MAACCTGAA CCCAGAAACA ACACAATCCG TTGCAAACCC AAAGGACCCA AGTACTTTCC 4680 

AATCTCAGTC ACTCTAGGAC TCTGCACTGA ATGGCTGACC TGACCTGATG TCCATTCATC 4740 
CCACCCTCTC ACAOTTCGGA CTTTTCTCCC CTCTCTTTCT A AOAGACCTC AA CTGG GCAG 4800 
ACTGC AAAAT AAAATCTCGG TGTTCTATTT ATTTATTGTC TTCCTGTAAG ACCTTCGGGT 4860 
CAAGGCAGAG GCAGGAAACT AACTGGTGTG AGTCAAATGC CCCCTGAGTG ACTGCCCCCA 4920 
GCCCAGGCCA GAAGACCTCC CTTCAGGTGC CGGGCGCAGG AACTGTGTGT GTCCTACACA 4980 

ATGGTGCTAT TCTGTGTCAA ACACCTCTGT ATTTTTTAAA ACATCAATTO ATATTAAAAA 5040 
TGAAAAGATT ATTGGAAAGT 



SEQ ID NO:124 PFJ1 Protein sequence:
Protein Accession #: NP_001835.2

1 11 21 31 41 51
| | | | | |

MKLQAPQSL VLLTLLVAAV LRCQGQDVQB AGSCVQDGQR YNDKDVWKPB PCRICVCDTG 60 
TVLCDDUCE DVKDCLSPE1 PFGECCPICP TDLATASGQP GPKGQKGEPO DKDIVGPKG 120 
PPGPQGPAGE QQPRGDRGDK OEKGAPQFRO RDGEPGTPGN PGPPGPPOPP GPPGLGGNFA 180 
AQMAGGFDEK AGGAQLGVMQ GPMGPMGPRG PPGPAGAPGP QGPQGNPGEP GEPGVSGPMG 240 
PRGPPGPPGK PGDDGEAGKP GKAQERGPPO PQGARGFPGT PGLPGVKGHR GYPGLDGAKG 300 
EAGAPG VKGE SGSPGENGSP GPMGPRGLPG ERGRTGPAGA AGARGNDGQP GPAGPPGPVG 360 
PAGGPGFPGA PG AKGEAGPT GARGPEOAQG PRGEPGTPGS PGPAGASGNP GTDGIPGAKO 420 
SAGAPG1AGA PGFPGPRGPP GPQGATGPLG PKGQTGEPGI AGFKGEQGPK GEPGPAGPQG 480 
APGPAGEEGK RGARGEPGGV GPIGPPGERG APGNRGFPGQ DGLAGPKGAP GERGPSGLAG 540 
PKGANGDPGR PGEPGLPGAR GLTGRPGDAG PQGKVGPSGA PGEDGRPGPP GPQGARGQPG 600 
VMGFPGPKGA NGEFGKAGEK GLPOAPGLRG LPGKDGETGA AGPPGPAGPA GERGEQGAPG 660 
PSGPQGLPGP PGPPGEGGKP GDQG VPGEAG APGLVGPRGE RGFPGERGSP GAQGLQGPRG 720 
LPGTPGTDGP KGASGPAGPP GAQGPPGLQG MPGERGAAG1 AGFKGDKGDV GEKGPEGAPG 780 
KDGGRGLTGP IGPPGPAGAN GEKGEVOPPG PAGSAGARGA PGERGETGPP GPAGFAGPPO 840 
ADGQPGAKGE QGEAGQKGDA GAPGPQGPSG APGPQGPTGV TGPKGARGAQ GPPGATGFPG 900 
AAGRVGPPGS NGNPGPPGPP GPSGKDGPKG ARGDSGPPGR AGEPGLQGPA GPPGEKGEPG 960 
DDGPSGAEGP PGPQGLAGQR GIVGLPGQRG ERGFPGLPGP SGEPGKQGAP GASGDRGPPG 1020 
PVGPPGLTGP AGEPGREGSP GADGPPGRDG AAO VKGDRGE TOAVGAPGAP GPPGSPGPAG 1080 
PTGKQGDRGE AGAQGPMGPS GPAGARGIQG PQGPRGDKGE AGEPGERGLK GHRGFTGLOG 1140 
LPGPPGPSGD QGASGPAGPS GPRGPPGPVG PSGKDGANGI PGPIGPPGPR GRSGETGPAG 1200 
PPGNPGPPGP PGPPGPGIDM SAFAGLGPRB KGPDPLQYMR ADQAAGGLRQ HDAEVDATLK 1260 
SLNNQESIR SPEGSRKNPA RTCRDUCLCH PEWKSGDYWI DPNQGCTLDA MKVFCNMETG 1320 
ETCVYFNPAN VPKKNWWSSK SKEKKHIWPG ETINGGFHFS YGDDNLAPNT ANVQMTFLRL 1380 
LSTEGSQNIT YHCKNSIAYL DEAAGNUCKA LUQGSNDVE IRAEGNSRFT YTALKDGCTK 1440 
HTGKWGKTVI EYRSQKTSRL PKDIAPMDI GGPEQEFGVD IGPVCFL 



SEQ ID NO:125 PFH9 DNA SEQUENCE

Nucleic Acid Accession #: NM_005084

Coding sequence: 162-1487 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

GCTGGTCGGA GGCTOGCAGT GCTGTCGGCG AGAAGCAGTC GGGTTTGGAG CGCTTGGGTC 60 
GCGTFGGTGC GCGGTGGAAC GCGCCCAGGG ACCCCAGTTC GCGCGAGCAG CTCCGCGOOG 120 
CGCCTGAGAG ACTAAGCTG A AACTGCTGCT CAGCTCGCAA GAT£GTGCCA CCCAAATTGC 180 
ATGTG CI 1 1 1 CTGCCTCTGC GGCTGCCTGG CTGTGGTTTA TCCTTTTGAC TGGCAATACA 240 
TAAATCCTGT TGCCCATATG AAATCATCAG CATGGGTCAA CAAAATACAA GTACTGATGG 300 
CTGCTGCAAG CTTTGGCCAA ACTAAAATCC CCCGGGG AAA TGGGCCTTAT TCCGTTGGTT 360 
GTACAGACTT AATGTTTGAT CACACTAATA AGGGCACCTT CTTGCGTTTA TATTATCCAT 420 
CCCAAGATAA TGATCGCCTT GACACCCTTT GGATCCCAAA TAAAG AATAT TTTTGGGGTC 480 
TTAGCAAATT TCTTGG AACA CACTGGCTTA TGGGCAACAT TTTGAGGTTA CTCTTTGGTT 540 
CAATGACAAC TCCTGCAAAC TCGAATTCCC CTCTO AGGCC TGGTGAAAAA TATCCACTTO 600 
TT U1 l i mC TCATGGTCTT GGGGCATTCA GGACACTTTA TTCTGCTATT GGCATTGACC 660 
TGGCATCTCA TGGGTTTATA GTTGCTGCTG TAGAACACAG AGATAGATCT GCATCTGCAA 720 
CTTACTATTT CAAGG ACCAA TCTGCTGCAG AAATAGGGGA CAAGTCITGG CICTAOCTTA 780 
GAAOCCTGAA ACAAGAGGAG GAGACACATA TACGAAATQA GCAGGTACGG CAAAGAGCAA 840 
AAGAATGTTC CCAAGCTCTC AGTCTGATTC TTGACATTGA TCATGGAAAG CCAGTGAAGA 900 
ATGCATTAGA TTTAAAGTTT GATATGGAAC AACTGAAGGA CTCTATTGAT AGGGAAAAAA 960 
TAGCAGTAAT TGGACATTCT TTTGGTGGAG CAACGGTTAT TCAGACTCTT AGTGAAGATC 1020 
AGAGATTCAG ATGTGGTATT GCCCTGGATG CATGGATGTT TCCACTGGGT GATGAAGTAT 1080 
ATTCCAGAAT TCCTCAGCCC CTCTnTTTA TCAACTCTG A ATATTTCCAA TATCCTGCTA 1140 
ATATCATAAA AATGAAAAAA TGCTACTCAC CTG ATAA AGA AAGAAAGATG ATTACAATCA 1200 
GGGGTTCAGT CCACCAGAAT TTTGCTGACT TCACTTTTGC AACTGGCAAA ATAATTGGAC 1260 
ACATGCTCAA ATTAAAGGG A GACATAGATT CAAATGTAGC TATTG ATCTT AGCAACAAAG 1320 
CTTCATTAGC ATTCTTACAA AAGCATTTAG GACTTCATAA AGATTTTGAT CAGTGGGACT 1380 
GCTTGATTGA AGGAGATGAT GAGAATCTTA TTCCAGGGAC CAACATTAAC ACAACCAATC 1440 
AACACATCAT GTTACAGAAC TCTTCAGG AA TAGAGAAATA CAA TTAGG AT TAAAATAGGT 1500 
TUll 



SEQ ID NO:126 PFH9 Protein sequence:
Protein Accession #: NP_005075.1

1 11 21 31 41 51
| | | | | |

MVPPKLHVLF CLCGCLAVVY PFDWQYINPV AHMKSSAWVN KIQVLMAAAS FGQTKIPRGN 60 
GPYSVGCTDL MFDHTNKGTF LRLYYPSQDN DRLDTLWIPN KEYFWGLSKF LGTHWLMGN1 120 
LRLLFGSMTT PANWNSPLRP GEKYPLWFS HGLGAFRTLY SAIGIDLASH GFIVAAVEHR 180 
DRSAS ATYYF KDQSAAEIGD KS WLYLRTLK QEEEIHKNE QVRQRAKECS QALSULOID 240 
HGKPVKNALD LKFDMEQLKD SCDREKIAVI GHSFGGATVI QTLSEDQRFR CGIALDAWMF 300 
PLGDEVYSRI PQPLFFINSE YPQYPANUK MKKCYSPDKE RKMTTIRGS V HQNFADFTFA 360 
TGKUGHMIJC LKGDIDSNVA IDLSNKASLA FLQKHLGLHK DFDQWDCUE GDDENUPGT 420 
MNTTNQHIM LQNSSGEKY N 



SEQ ID NO:127 PFH8 DNA SEQUENCE

Nucleic Acid Accession #: NM_015300

Coding sequence: 32-1402 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

CACGAGCGGC ACGAGGATTT CCAGCTCAGC GATGCCCCCA GGTCCCTGGG AGAGCTGCTT 60 
CTGGGTGGGG GGCCTCATTT TQTGGCTCAG CGTTGGAAGT TCAGOGOATG CACCTCCTAC 120 
CCCACA OCCA AAGTGCGCTG ACTTCCAGAG CCCCAACCTT TTTGAAGGCA CCGATCTCAA 180 
AGTCCAGTTT CTCCTCTTTG TCCCTTCGAA TCCTAGCTGT GGGCAGCTAG TAG AAGGAAG 240 
CAGTGAGCTC CAAAACTCTG GGTTCAATGC CACTCTGGGA ACCAAACTAA TTATCCATOG 300 
ATTCAGGGTT TTAGGAACAA AGCCTTCCTG GATTGACACA TTTATTAGAA CCCTTCTGCG 360 
TGCAACGAAT GCTAATGTGA TTGCCGTGGA CTGG ATTTAT GGGTCTACAG GAGTCTACTT 420 
CTCAGCTGTG AAAAATGTGA TTAAGTTGAG OCTOG AGATC TGOCmTOC TCAATAAACT 480 
CCTGGTGCTG GGTGTGTCGG AATCCTCAAT OCACATCATT GGTGTTAGCC TGGGGGCCCA 540 
CGTTGGGGGC ATGGTGGGAC AGCTCTTGGG AGGOCAGCTG GGACAGATCA CAGGCC1GGA 600 
CCCCGCTGGA CCTGAGTACA CCAGGGCCAG TOTGGAAGAG CGCTTGGATG CIGGAGATGC 660 
CCICTTOGTG GAAGCCATCC ACACAGACAC GGACAATTTG GGTATTCGGA TTOOOGTTGG 720 
ACATGTGGAC TACTTCGTCA ACGGAGGCCA AG ACCAAOCT GGCTCOCCCA CCTTCTTTTA 780 
CGCAGGTTAT AGTTATCTGA TCTGTGATCA CATGAGGGCT GTCCACCTCT ACATCAGCGC 840 
CCTGGAGAAT TCCTGTDCAC TGATGGOCTT TCCCTGTGCC AGCTACAAGG CCTTOCTTGC 900 
TGGACGCTGT CTGGATTGCT TTAACCCTTT TCTGCTTTCC TGCCCAAGGA TAGGACTGGT 960 
GGAACAAGGT GGTGTCAAGA TAGAGCOGCTCCCCAAGGAA GTGAAAGTCT ACCTCCTGAC 1020 
TACTTCCAGT GCTCCGTACT GCATGCATCA CAGCCTCGTG GAGTTTCACT TGAAGG AACT 1080 
GAGAAACAAG GACACCAACA TCGAGGTTAC CTTCCTTAGC AGTAACATCA CCICTTCATC 1140 
TAAGATCACC ATACCTAAGC AGCAACGCTA TGGGAAAGGA ATCATAGCCC ATGCCACCCC 1200 
ACAATGGCAG ATAAACCAAG TGAAATTCAA GTTTCAGTCT TCCAACCGAG TTTGGAAAAA 1260 
AGACCGGACT ACCATTATTG GG AAGTTCTG CACTGOCCIT TTGCCTGTCA ATGACAGAGA 1320 
AAAGATGGTC TGCTTACCTG AACCAGTGAA CTTACAAGCA AGTGTGACTG TTTCCTGTGA 1380 
CCTGAAGATA GCCTGTGTOTAGTTTAACCT GGGCAGGACA CATCTCCCTG CATH.THT1 1440 
TTTTmTTT GAGAGAGAGG TGTGATGAGG GATGTOTGTG TGCAGCTTAT TGTAGACCAT 1500 
TACTACTAAG GAGAA AAGC A AAGCTCmC TTATTTTCCT CATAATCAGC TACCCTGGAG 1560 
GGGA G GGAG A ACTCATTTTA CAGAACTTGG 1 MUCll 1GC OGATCTTATG TACATACCCA 1620 
TnTAGCTTT CCCATGCATA CTTAACTGCA CTTGCTTTAT CICCTTGGGC ATTCGTACTT 1680 
AGGATTCAAT AG AAACATGT ACAGGGTAAA CAATTTTTTA AAAATAAAAC TTCATGGAGT 1740 
AAAAAAAAAA AAAAAAAA 



SEQ ID NO:128 PFH8 Protein sequence:
Protein Accession #: NP_056984.1

1 11 21 31 41 51
| | | | | |

MFPGPWESCF WVGGULWLS VGSSGDAPPT FQPKCADFQS ANLFEGTDLK VQFLLFVPSN 60 
PSCGQLVEGS SDLQNSGFNA TLGTKLHHG FRVLGTKPSW IDTFIRTLLR ATNANVIAVD 120 
WIYGSTGVYF SAVKNVIKLS LEISLFLNKL LVLGVSE5SI HUGVSLGAH VGGMVGQLFG 180 
GQLGQrTGLD PAGPEYTRAS VEERLDAGOA LFVEAIHIDT DNLGIRIPVG HVDYFVNGGQ 240 
DQPGCPTFFY AGYSYLICDH MRAVHLYIS A LENSCP1MAF PCASYKAFLA GRCLDCFNPF 300 
LLSCPR1GLV EQGGVKIEPL PKEVKVYLLT TSS APYCMHH SLVEFHLKEL RNKDTNIEVT 360 
FLSSNTTSSS KITIPKQQRY GKGHAHATP QCQINQVKFK FQSSNRVWKK DRTTnGKFC 420 
TALLPVNDRE KMVCLPEPVN LQASVTVSCD LIOACV 



SEQ ID NO:129 PFH7 DNA SEQUENCE

Nucleic Acid Accession #: NM_014384

Coding sequence: 89-1336 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

CGTTGCOGGG TCGCAGGTCC CGCCAGTGCG AGCGCAACGG AGGTCGAAGG GGTTCAGACT 60 
CTTAGCTGAA CGCGGAGCTG CGGCGGCTAXCCTCTGG AGC GGCTGCCGGC GTTTCGGGGC 120 
GCGCCTCGGC TGCCTGCCCG GCGGTCTCCG GGTCCTCGTC CAGACCGGCC ACCGGAGCTT 180 
GAC CTCCT GC ATCGACCCTT CCATGGGACT TAATGAAGAG CAGAAAGAAT TTCAAAAAGT 240 
GGOCTTTGAC TTTGCTGCCC GAGAGATGGC TOCAAATATQ GCAGAGTGGG ACCAGAAGGA 300 
GCTGTTCCCA GTGGATGTGA TGCGGAAGGC AGCCCAGCTA GGCTTCGGAG GGGTCTACAT 360 
ACAAACAGAT GTGGGCGGGT CTGGGCTGTC ACGTCTTGAT ACCTCTGTCA TTTTTGAAGC 420 
CTTGGCTACA GGCTOCACCA GCACCACAGC CTATATAAGC ATCCACAACA TGTGTGCCTG 480 
GATGATTGAT AGCTTCGGAA ATGAGGAACA GAGGCACAAA TTTTGCCCAC CGCTCTGTAC 540 
CATXjOAGAAO TTTGCTTCCT ACTGCCTCAC TGAACCAGGA AGTGGGAGTG ATGCTGCCTC 600 
TCTTCTGACC TCCGCTAAGA AACAGGGAGA TCATTACATC CTCAATGGCT CCAAGGCCTT 660 
CATCAGTGGT GCTGOTGAGT CAG ACATCTA TGTGGTCATG TGCCGAACAG G AGGACCAGG 720 
CCCCAAGGGC ATCTCATGCA TAGTTGTTCA GAAGGGGACC CCTGGCCTCA GCTrTGGCAA 780 
GAAGGAGAAA AAGGTGGGGT GGAACTCCCA GCCAACACGA GCTGTGATCT TCGAAG ACTG 840 
TGCTGTCCCT GTGGCCAACA GAATIGGGAG CGAGGGGCAG GGCnCCTCA TTGCCGTGAG 900 
AGGACIG AAC GG AGGG AGGA TCAATATTGC TTCCTGCTCC CTGGGGGCTG CCCACGCCTC 960 
TCTCATOCTC ACCCG AG ACC ACCTCAATQT CCGOAAGCAO TTTGGAOAGC CTCTGGCCAG 1020 
TAAOCAGTAC TTGCAATTCA CACTGGCTGA TATGGCAACA AGGCTGGTGG CCGCGCGGCT 1080 
GATGGTCCGC AATCCAGCAG TGGCTCTGCA GQAOOAQAGO AAGGATGCAG TGGCCTTOTO 1140 
CTCCATGGOC AAO C1C1 1 lO CTACAGATGA ATGCTTTGCC ATCTGCAACC AGGCCTTGCA 1200 
GATGCAOGGG GGCTACGGCT ACCTO AAGGA TTAOGCTOTT CAGCAGTAOG TGCGGGACTC 1260 
CAGGGTCCAC CAGATTCTAG AAGGTAGCAA TGAAGTGATG AGGATACTOA TCTCTAQAAG 1320 
CCTGCTTCAG GAGTAGAACC CACACTTGTT CTGGCCTGGT GTTCAGTGCG ACTGCAGTCA 1380 
GTGTTGAGTG GTGCCATGTG GGCCGCTCTA TTCCAAAGGA ATCATGGATT AGACCCAAGG 1440 
GCTGAGCTOC TCTAGGGCAG GACCTGCACC CTGTGTGTTG GCACCAGCAT OGGGTCTTGG 1500 
ACTGGGGCAG AATCOCCAGT GGAACCGGAA GAGCTGGACT GATGAGAAAC ATCAGAAGAA 1560 
CACATACTAC C T TO ITTTC C TAATGOCAOA AOOOTGACCA GTGAAOATTC ACOGTCAAAC 1620 
CATGAAAGTC C111C11GGA TCCACTTTAT CTTGATTAGT CTGCATTTTA CTAGTTCACT 1680 
GG ATCCCTCC TCTAGGGOCC TGGGGACTTT CACTGATGCT CITOCTGATT CTAGAGCAAA 1740 
GGTGTGGGAA GGGG AAATGG AGGAATGCCC TCCTGTCTGT GTCGTTCTCT GTGCCACAGC 1800 
TACAG ATGCA GAAGGTTTCT CTGGATAGCA CACCTCTGAA TGT AAATCAT GATAAAATGG 1860 
ATATTTGGAA ACTTACTCCT AAGCTGTG AT GTAGGGTGTA TTTCTACTTC TGGACTGCCT 1920 
CAATATCAAG GGCTG AGACT TTTGAATGTT GAATATTCGT TGGGTTICAT GTTAAGACGC 1980 
CTGTGGTCCA GGAGTGCTAT TCAGTGTTTC TGTTCCTGAT AAACACTTTG AATATTTTTT 2040 
TGTGTTTTTG TTTOCTTTTC TGAAGCTGTT CCTOCTnTA AATATTTTTA ATCACATTGA 2100 
TAAAATCTAT CCTTCATCXA CCTCTGGTTC TACTATAGTT GA'l'liii ATT TTAAATGTTT 2160 
AATrGTATTT GATTAAACAC TTAACTGOAT TTTGGAATAA TAAAACTCTC GTCCAATTTG 2220 
GCTTTTAAAA AAAAAAAA 



SEQ ID NO:130 PFH7 Protein sequence:
Protein Accession #: NP_055199.1

1 11 21 31 41 51
| | | | | |

MLWSGCRRPG ARLGCLPGGL RVLVQTGHRS LTSODPSMG LNEEQKEPQK VAFDFAAREM 60 
APNMAEWDQK ELFPVDVMRK AAQLGPGGVY IQTDVGGSGL SRLDTSVIFE ALATGCTSTT 120 
AYIS IHNMCA WMIDSFGNEE QRHKFCPPljC TMEKFAS YCL TEPGSGSDAA SLLTS AKKQG 180 
DHYILNGSKA FISGAGESDI YWMCRTGGP GPKGISOW EKGTPGLSFG KKEKKVGWNS 240 
QFTRAV1FED CAVPVANRIG SEGQGFLIAV RGLNGGRINI ASCSLGAAHA SVH.TRDHLN 300 
VRKQFGEPLA SNQ YLQFTLA DMATRLVA AR LMVRNAAVAL QEERKD A VAL CSMAKLFATD 360 
ECFAICNQAL QMHGGYGYLK DYAVQQYVRD SRVHQILEGS NEVMREJBR SLLQE 



SEQ ID NO:131 PFH6 DNA SEQUENCE

Nucleic Acid Accession #: NM_013989

Coding sequence: 707-1528 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

GCCTGCAGAG AGAGGCACTT TGCACCACAG ACAGATAGCA AGAAGGGAAA GACAGAGAGT 60 
GAGAAAAAAG AGGAGTCAGT CGCTCCTGGG GAAGGGAGAG AGTGAGACTG GGAGAAAGAG 120 
AAGCACAGAA AGTGTGTGTA AAACGGAGTA AAGAAAGAAA AAAAAAAAAC TACGCTTAAA 180 
GCACATTTAA AAAAAAAAAA CTCTGGCAAT TCAAGAAAGA AACAGGCTAC GTTTAAAG AG 240 
CATAG AGACA ATGAAAGGCT AAAGAAAATT TTAAAATCTC TGCCACAGTC TCATAGGTGC 300 
TTGGAAATGA AAGTAGAACT GCCTGTCTTT AACGGACTCT GACAGAGGTA ACTGGATTAG 360 
GGACGAGTAC GCCAGCTTT 1 1 1 1 1 1 1 1 1 1 1 11 1111111 I 1 1TAACATCT TAAATCCTGA 420 
AAAAAAAAAA AAAAAAAAAA AAAAGGCAGC AGCTCCGAAT TGAATGAATT GATGGGCACA 480 
CTCCAACTGC TGGGCTGGAG AGACTGGACT TAGTCTTGCC A 'l ITCTGCTT CTTTGAAAGA 540 
GGAGACAACT TGGGCTTCCT TTTAATTTAG TTTTTTTTCC CCTTCTCOOC CAACCCCCAA 600 
OCTTCCCCCr TACCTGCCCC ACCCCCTTTA TCAGCACCCC CCTTTTAAAT AAGAGGGTGA 660 
AGGGGAACCA GAGCGCACAA GGGAACTGAC TCAGGAGGCA G AG AAGATGG GCATCCTCAG 720 
CGTAGACTTG CTGATCACAC TGCAAATTCT GCCAGTTTI 1 TTCTCCAACT GCCTCTTOCT 780 
GGCTCTCTAT GACTCGGTCA TTCTGCTCAA GCACGTGGTG CTGCTGTTGA GCCGCTCCAA 840 
GTCCACTCGC GGAGAGTGGC GGCGCATGCT GACCTCAGAG GGACTGCGCT GCGTCTGGAA 900 
GAGCTTCCTC CTOGATGOCT ACAAACAGGT G AAATTGGGT GAGGATGCCC CCAATTCCAG 960 
TGTGGTGCAT GTCTGCAGTA CAGAAGGAGG TGACAACAGT GGCAATGGTA CCCAGGAGAA 1020 
GATAGCTGAG GG AGCCACAT GOCAOCTTCT TGACTTTGCC AGCCCTGAGC GCOCACTAGT 1080 
GGTCAACTTT GGCTCAGCCA CTJQACCTCC TTTCACGAGC CAGCTGOCAG CCTTCCGCAA 1140 
ACTGGTGGAA GAGTTCTCCT CACTGGCTGA CITCCTGCTG GTCTACATTG ATGAGGCTCA 1200 
TCCATCAGAT GGCTGGGCOA TACCGGGGGA CTCCTCTTTG T CI II 1G AGG TGAAGAAGCA 1260 
CCAGAACCAG GAAGATCGAT GTGCAGCAGC CCAGCAGCTT CTGGAGCGTT TCTCCTTGCC 1320 
GCCCCAGTGC CGAGTTGTGG CTGACCGCAT GGACAATAAC GCCAACATAG CTTACGGGGT 1380 
AGCCTTTGAA CGTGTGTGCA TTGTGCAGAG ACAGAAAATT GCTTATCTGG GAGGAAAGGG 1440 
CCCCnCICC TACAACCTTC AAGAAGTCCG GCATTGGCTG GAG AAGAATT TCAGCAAG AG 1500 
ATGAAAGAAA ACTAGATTAG CTGGTTAAAG GTATGATTAT AAGAG AGCTT ATTG 1 1 ' ITA A 1560 
AAAGTTATAT AAAGGCAAGG AAATTAAGAA CTG AATCCAT ATTTCAACAG AGCCCTATTG 1620 
GCTTACTGAA AGACAGGAGT TTATCTATCG GAAGAACATG AATCTCTAAC AGCTCCATAC 1680 
TTCTTTCACT ACTCAAATGG CATTGGGCTG AGTAAGTAAC CATATCACCT CTCTTCTTAG 1740 
TAAAAAGCCC TATGTGAAAA GATCCCAAGA TGGAGAGGAA GAAACGCTAA TTCAGCATGT 1800 
GTTCATTCTG CATTGAGAAG GAACTGATAC ATCTGATGCA TGCTTTGAGA CCAGAAGAAA 1860 
AGACTTACCT GAATAATTAC TACATTAGGG AAGCTACTGT CTACGTTAAG ATAAAGGGTA 1920 
TTGCCITGGC TCTATTTGGC ATGGATGGAG CCCAGTTOQA AAATTCCCAA ATATTACAAC 1980 
AAGTCCTTG A ACCCAGGCCA TOTCGTTAGA CGTTGGTGTT AAGGTTAGAC CITATGTTAO 2040 
AGTCATCTCT GATGTTCCAG CTTCTAGCCA TQTAGTOCTC TCAGTCTTCA TAOCCCAGAA 2100 
ATTATTGGTA TATTTGTAGA TAOCGAGAAT GATOOCICAG TCTOAGAGGTTAGAATOATC 2160 
ATCTGTAATC TGAGGGTTAA 11 1LTAGGCA GGTGGAGAQA OTGGTAAAAA AGAAATOAAA 2220 
T TOACAA GCT AGGAAAGAGG AGGCAGAAAG ATTTGGAAAA TTCACAGAGT TTCACCCTTA 2280 
AGCTGTAGAG AGTGGGTCAC ATTTGTTAGC CACGGAAACA TAGAAACATA CACAAGGCCA 2340 
GAAAAAGAAG AAGGAGCTCA ACTAAAAGTG GCATAGAGAA TACACATATA AAAACAATAT 2400 
ATTTGTCATA TGCTCCTAGA GAGGAGAAAG GOGTGATIGA AAOAAAAAAA AATACTTAAA 2460 
TATTTGTAAT TGTGAGGGGT TTCTTTTGGA AATAATTACT TTTOAACCAT GTATGTGGTA 2520 
TGTATATTTT CAGTGGGTTA ATTATACCCC ATGATACCTA TTAAAGGAAA ACCAGTGGOT 2580 
CTGGTGGTGC TGGTCTTTTC CTCCCCATTC CTACAATTTC TATGTGGCCC AAGTCATTCC 2640 
TAATCITGGT CTCTATAGCA GTGTTCTCIC TGAATGCTGA GCTGAAGAAA TTATACGTAC 2700 
ATACACACAT ACATACATAC ATACAAATAT ATGTATATAT ATTCTCAGCT GCTGOGGGAG 2760 
GTAGGTACCA TGGCCATTCA GCACAGCCTT GATTTCCTCC CAAAGTAGGT GAGCTATAGT 2820 
GAAGAATAGG TGCAAACAAA CAAGCTTACT TCCATTGCAA AATAGAAGAA GAGGAAGTTA 2880 
GAG ATAATTC TGATCAATCA TTTTGGAGGC TTTGTTATAA GGCAAOCCCC GGTATATCAT 2940 
GGAATTTCCA TTGACATTTG AATTTGGACT TGGATCTTCC CTTGGTCCCA TTAGCTGAGG 3000 
TTTAGTAATC TAA AGTCOCT ATAGTATATG ATTATAATGC TATTTTAAAA AATATATATA 3060 
TAAAATATTT TTTTCTTTTT AAAATAGACA CTATAGTTTT ACCCATAAGT AATATTTAAA 3120 
GATTATAGCT CCCAAAAGAA TGGACCAACC ACTTTCGTAT CATAA'l 1TCT TTTTGGTAAA 3180 
TATGAGACTA TTATGAAATC ATAGTATATG ATTOTATTTA AAGGTACAAT CAAAGGATCT 3240 
TTTGTCCATT CCATTAATAA CTGAATAAAA AATAAATAAA ATGGATAGAA AAAAACTAAA 3300 
GTTGAAAATA CATTC TTAAA CTAGTTGTCT OAAATGAOAA AAOAGTG AGA ACTAGGTGTG 3360 
CAAGAACCAA ACGTATTTTA TTTTATTTTT TAAATGGGAG CAACATATCA GTCGTGTCAC 3420 
CAGCTGGTAT ATTGTGTAAA TATTAAAGCT CCATTGGGAC TQATTTTTCA TGGCAACATC 3480 
AGCTTTCTAA TGTTCTAAAT TCTATAAAAA CCACCCACAA AGAAACAAAG CAAATTTCAT 3540 
TATCTAATGA GTTGCTGGAA AATCATATTG AGAATAATTA TTTCAGATTC CTCA OUOU 3600 
AACTTCTACA TTCAAGGGCT TATCTCTGCC CCCATTGATT TTTAACCICA AAATGGTGTG 3660 
AGATTTACTG TGGAACCCTA AAGCAGTAAA ATAAAAAACC TGGTTGCAGC ACATTCACAC 3720 
TCTTGTCCTT AAAATTCCCC TTITTTCTCT ATGTACG ATA AAGTAACAGT ATGTCAGATA 3780 
AGCCGGTGGG GGGATGAGAT TAGGCTGAGG CAGTGCTAGT CAACTGGGGQ AAAAGOATGA 3840 
TGGAAAAATC ACCCAGTTGT GCTATATTTT TAAAGAAGG A GGTCGTTTAT GTGTGCAGAC 3900 
AATTCTCCCT GAGGTTAGCC CAATGGAGAA ATGAAGCAGA GGAAGGAAAC ATAGAAAGAC 3960 
ATGGGCTATC AGGGAGGAAG ATGTTCAATA GAACATGCAA GAATTTCTGG AAGAAAGGCT 4020 
GTGGAAGGGC CAATGGAGAA AATGAATGGA CAAAGCTCAG GAATCCCTAC GCTATGTAGA 4080 
AT GTTCT TGG TGTTATCAGG GTTAAGCCCT GTAATTATGT AACCTATTTA TOGCAACATG 4140 
AAI1IIIATG ATTTCTTGTG ATGTATTCTT TTATGAAATT AACAAGAACT CATTATTTTG 4200 
AGGTAGAGGA AAATCAATGC TTTATCTGAT ATGCTGAGAA ATTATTAG AT TGCCAATACT 4260 
CATGTGCGTT TCATGTGTTT TATAAGGTTT GTTCCnTG A AG AATTGTAG TTCTTAGTCC 4320 
CACAGGGAAA TGTGTATCTA TTTATATATC ATAOTATAAA TCTATGATAT ATTTATATCA 4380 
TATATAAAAG TCTGAGTTCT CTTTCTTAGT CCCTAATCAT GTTTCTCCCA TAGGCTGTGT 4440 
TTACATGGAG CTATCGGTTT AOCCnTTAA GCTTCATTAG CTTGTCTATT ATTG AAATAG 4500 
TTTCCAAG AA ATTTTAG ATA TTATCATAAC ATCTGGGTCT ACTCAAACAC TTATTGTTTG 4560 
AAAGACTTAT GTCTTGG ACC TATCAAAAAC TOACTTTATT TATTGCTTAG TGAAAATACT 4620 
AGTGGG ATCA ACAATGATTT TCTTGAATGG GCATGAATGG AGATGCCCGC ACAGTAATGT 4680 
AGAAATGTTT CATACAGCTA TTAAAATGTA ACTGACCTCC TTAGAGGCAG ATTAOTAACT 4740 
GTrCCTACTT TGTATAGCTA AGTGACAGTC ACTTAACTTA CATGACTTTC UiUlC ACA 4800 
TTGGGTCTCT GGTCCTGTGT CTICACCTCA TTTATAGCAC GTCTCCTTG A TTTTTGGTAG 4860 
TATCAACTTC CCAGTGATCT GTTCAGTTAA GTTCTTCTCC CGTTAACCAG GAAGTGCTTA 4920 
TTCTCTCATC ACAGTGGGAA GAATAGCCTA TTGTCTTTCA TTTTGCCTGA GTGTATTTTA 4980 
CTATTTGGGC TCTGAAATAA AAATTATGAA ATATGGTGAG GTCACATGTT GGTGCTGCCT 5040 
TGCTGCATAA AATTCTAGGA GGGCAGGTTA GGAGACAGTT ATGTATGGGC TTTCGGGAAA 5100 
ATTCAAAGGG TGGGATTACA AGGGTGTTCC TCAGGCATGC CCCTATGGGC CCTATGTGGA 5160 
AGCAAG AAGA ATTG ACTG AT TTACAGG ACT T C1C1 1 ' IA TG TCAATCTTAA GAGGATGG AT 5220 
GAATCTGGAC ATTTGTTCCA CCCGACCTCT GACTOATGGT TTGGAAAATA ACTTTAATTA 5280 
GGATCATATG ACCATTGAAA AAGGAAAAAT GTAGACTCTG ACTTCCGTCC CACTGAAGGA 5340 
TTAATGAAAA CCTTTACTAO CATTTAGAGC TTTTCAGAAC ATCCCCACTG TCATGTGTCT 5400 
CAGCAGTGGA GACTGCAAGT AAGGCTTTTA ATTTTAGGAG G l 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 5460 
TTCCCCTAAA TGGTATGGCC AAAAGTCAGA GTTAAAATAT ATATAGTTAG ATTCGAACTT 5520 
CCTCCTTCAC TCTAAAAATA GAATCCAAAC CCACTCTTCA TATATGCTTC CAGAATGGGG 5580 
CTTAAGTACC AATCTCTGCT TTGCAATGGG CACAATCTTG GTCATGTCCT GAGGCTCTCT 5640 
AAGAAAAGAG AGGATCTAGG ATGGGAGAGC TAGAAAGTTG CTAACTGGGA AGAACAAGGC 5700 
CCTGAGGGGT TGGTCTACCA ATCTGGGAAG ATTTGAAAAC AAACTTCTCG CAACTGAAGG 5760 
AAGGCTQ AAG GCTGCTGCAA GTCATTGAGT GACTTTAGGA TGAGCAAAAC ATTGGGCCAC 5820 
TTCCTAATGC CCTATGTGTA TAGTACCAG A AGCAAGGTCT CAG ACTTAAC AGACCCAGCT 5880 
CTGTTCCAAG GTGAGTCTGA ACCAATAGAA AGCAAACATG TGCAGATATC CAAACAAGAC 5940 
TGCTCATGCA AGTCGGGGCT GGCTACCCGT CTTAGGCAGC AACAGCAGAG CTCCAGGGAG 6000 
CTTATTCAAT ATTTACTGAG ACTTCGAAGA CCCAGCAGAT GTTTAATGAA GTCACTATTT 6060 
TGGCTCAAAC CCTCCACTTC TCCCCCTCCC CTCAAAAAGC CAACAGGTAA ACACATAAAT 6120 
GAAAGAAACC CACAGA AGGG GATGGGAAAT AAAOAAAATT CTCTCAAGAC TTCTCCAGGC 6180 
CCATGTCACT GGTCAGCGTG GTnTTATGT GTATTAGGAT TGGGGGATGT GAAGAAATAA 6240 
GTATCCAG TA CTTTATAACC AAAGCAATTA AATGATATTG GGGTAGGGAA TGTTGGCCAG 6300 
TTTTGTTTAG TTTTGCCATC ACATTGTCAC CCAGACCTCA CCTAGCCCCA AGTAATCGGG 6360 
CGCCCCGAAG AGGGAGACAG AGATGTGCCA OAGTTGACCC AGTGTGCGGA TGATAACTAC 6420 
TGA CGAAAG A GTCATCGACC TCAGTTAGTG GTTGGATGTA GTCACATTAG TTTGCCTCTC 6480 
CCCATCTTTG TCTCCCTGGC AAGGAGAATA TGCGGGACAT GATGCTAAGA GCCCTGGGTA 6540 
AATGTGGTGA GAATGCACGC GTGCATATGC TACACATATG TGCTTCTCAG TTGCAGAAAA 6600 
TGAACTGCTT TGGGAGATTA TCAGTAGAAA GAGTGTTATC ATATTGGTGC TGAGTGCTAT 6660 
GTGTGCTTAT ACAATTTGTT CTTGTATTTT AATAAACTTT GAATAAAAQA ATAAAAAAAA 6720 
AAAAAAAAAA AAAAA 



SEQ ID NO:132 PFH6 Protein sequence:
Protein Accession #: NP_054644.1

1 11 21 31 41 51
| | | | | |

MGILSVDLU TLQILFVFFS NCLFLALYDS VILLKHWU. LSRSKSTRGB WRRMLTSEGL 60 
RCVWKSFLLD AYKQVKLGED APNSSWHVS STEGGDNSGN GTQEfOAEGA TCHLLDFASP 120 
ERPLWNFQS ATXPPFTSQL PAFRKLVEEF SS VADFLLVY IDEAHPSDOW AIPGDSSLSF 180 
BVKKHQNQED RCAAAQQLLB RFSLPPQCRV VADRMDNNAN IAYGVAFERV CJVQRQK1AY 240 
LGGKGPF5YN LQEVRHWLEK NFSKRXKKTR LAG 



SEQ ID NO:133 PFH5 DNA SEQUENCE

Nucleic Acid Accession #: NM_001141

Coding sequence: 72-2102 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

CAGGCGTOTC CCAGGGGGAG CCCCGCTCTG CAGOCCTOTQ CGCCGTAGAG AGCTGGACTT 60 
AGGCIGGCAG CAIQGCCGAG TTCAGGGTCA GGGTGTCCAC CGGAGAAGCC TTCGGGGCTG 120 
GCACATGGGA CAAAGTGTCT GTCAGCATCG TGGGGAOOCG GGGAGAGAGC CCCCCACTGC 180 
OOCTGGACAA TCTCGGCAAG GAGTTCACTG CGGGCGCTGA GG AGGACTTC CAGGTGACGC 240 
TOCCGGAGGA CGTAGGCCGA GTGCTGCTGC TGCGCGTGCA CAAGGCGCCC CCAOTGCTGC 300 
CCCTGCTGGG GCCCCTGGCC COGGATGOCT GGTTCTGCGG CTGGTTCCAG CTGACACCGC 360 
CGCGGGGCGG (XACCTCCTC TTCCCCTGCT ACCAGTGGCT GGAGGGGGCG GGGACCCTGG 420 
TGCTGCAGGA GGGTACAGCC AAGGTGTCCT GGGCAGAOCA CCACCCTGTG CTCCAGCAAC 480 
AGOGCCAGGA GGAGCTFCAG GCCCGGCAGG AGATGTACCA GTGGAAGGCT TACAACCCAG 540 
GTTGGCCTCA CTGCCTGGAT GAAAAGACAG TGGAAGACIT GGAGCTCAAT ATCAAATACT 600 
CCACAGCCAA GAATGCCAAC TTTTATCTAC AAGCTGGCTC TGCTTTTGCA GAGATGAAAA 660 
TCAAGGGGTT GCTGGACCGC AAGGGGCTCT GGAGGAGTCT GAATGAGATG AAAAGGATCT 720 
TCAACTTGCG GAGGACCCCA GCAGCTGAGC ACGCATTTGA GCACTGGCAG GAGGATGCCT 780 
TCTTOGOCIC CCAGTICCTG AATGGTCTCA ACCCTGTCCT GATCCGCCGC TGTCACTACC 840 
TCCCAAAGAA CTTCCCCGTC ACTGATGCCA TGGTGGCCTC ATTGTTGGGT CCIGGGACCA 900 
GCITGCAGGC TGAGCTAGAG AAGGGCTCCC TGTTCTTGGT GGATCACGGC ATCCTCTCTG 960 
GCATCCAGAC CAATGTCATT AATGGGAAGC CGCAGTTCTC TGCGGCCCCA ATGACCCTGC 1020 
TATACCAGAG CXXAGGCTGC GGGCCGCTGC TGCCTCTCGC CATCCAGCTC AGCCAGACCC 1080 
CCGGCCCAAA CAGCCCCATC TTCCTGCCCA CTGATGACAA GTGGGACTGG TITjCTGGCCA 1140 
AOACCTGGGT GCGCAATGCC GAGTTCTCCT TCCATGAGGC CCTCACGCAC CTGCTGCACT 1200 
CACATCTGCT GCCTGAGGTC TTCACCCTGG CTAOCCTGCG TCAGCIGCCC CACTGOCACC 1260 
CTCTCTTCAA GCTGCTOATC CCGCACACXX GATACACXXTT GCACATCAAC ACACTCGCCC 1320 
GGGAGCTCCT TATCGTCCCA GGGCAGGTGG TGGACAGGTC CACAGGCATC GGCATTGAAG 1380 
GCnCTCTGA GTTGATACAG AGGAACATGA AGCAGCTGAA CTATTCTCTC CTGTGTCTGC 1440 
CTGAGGATAT CCGGACCCGA GGAGTTGAAG ACATCCCAGG CTACTACTAC CGTGATGATG 1500 
GGATGCAGAT TTGGGGTGCA GTGQ AACGCT TTGTCTCTGA AATCATCGGT ATCTACTACC 1560 
CAAGTGATGA GTCTGTCCAA GATGACAGAG AGCTCCAGGC CTGGGTCAGA GAGATCTTCT 1620 
CCAAGGGCTT CCTAAAGCAG GAGAGCTCAG GTATCCCTTC CTCACTGGAG ACCCGGGAAG 1680 
CCCTGGTGCA GTATGTCACC ATGGTGATAT TCACCIGCIC AGOCAAGCAT GCGGCTGTCA 1740 
GTGCAGGGCA GTTTGACTCC TGTGCTTGGA TGCCCAACCT GCCACCCAGC ATGCAGCTGC 1800 
CACCACCCAC CTCCAAAGGC CTGGCAACAT GCGAGGGCTT CATAGCCACC CTCCCACCTO 1860 
TCAATGCCAC ATGTGATGTC ATCCTTGCIC TCTGGTTGCT GAGCAAGGAG CCTGGAGACC 1920 
AAAGGCCCCT GGGCACCTAT CCGGATG AGC ACTTCACAGA GGAGGCCCCT CGGCGGAGCA 1980 
TCGCCACCTT CCAGAGOOGC CTGGCOCAGA TCTCG AGGGG CA1CCAGG AG CGGAACCGGG 2040 
GCCIGGTGCT GCCCTACACC TACCTAGACC CTCCCCTCAT OGAGAACAGC GTCTCCATCI 2100 
ACATCCCAGG GGAACACAGG CCCAGATGAC ATCOCTTTGA OCACATCGCT CTAGGATAAC 2160 
TGGCACCCAG AGAAAAGGAC TCCTCAGAAA AAACAGGCCC CCATGTGCCT CTCCTGGGAC 2220 
AACCAGACTC TGTAACTCAC CCCCACCACC ATACACACAC ACAAAAACAG AAACAAAATC 2280 
AAAACAGAGA AAGCAGAAAA TCTACCAAGA ACAGAGTCTC AGGACAGAAC CACTGAGTCT 2340 
TTTGGAGGCT CCAAGCCTCA AAGTGCCCGC AGAGOCCACC TTGAGGGTTT TGCTAGTTGG 2400 
TTTTGTTTTG CGTTTACAGC CGTGGGGGGA AGCACATAAT CCCGCCCCAG GGCCCACTAG 2460 
CATCCACTGA TTGGACCTTA TGGTCACCCA ACTCAAGGAC AGCCACCAAG AAGTGGCTGC 2520 
CAAAGAGACT GGGCGCAGTG GCTCATGCCC ATAATCCCAG CACTTFGGGA GATGGAGGCG 2580 
GGAAAATCAT TTCAGGTCAG AAGTTCAAGG CCAGCCTGGA CGACATAGCG AGACTCCACC 2640 
TCTACCAAAA AATAAAAATT AAAAAACAAA AAAAAAAAAA AAAAA 



SEQ ID NO:134 PFH5 Protein sequence:
Protein Accession #: NP_001132.1

1 11 21 31 41 51
| | | | | |

MAEFRVRVST GEAPGAGTWD KVSVSIVGTR GESPPLPLDN LGKEFTAGAE EDFQVTLPED 60 
VGRVLLLRVH KAPPVLPLLG PLAPDAWFCR WFQLTPPRGG HLLFPCYQWL EGAGTLVLQE 120 
GTAKVSWADH HPVLQQQRQE ELQARQEMYQ WKAYNPGWPH CLDEKTVEDL ELNHCYSTAK 180 
NANFYLQAQS AFAEMK1KGL LDRKGLWRSL NEMKRIFNFR RTPAAEHAFE HWQEDAFFAS 240 
QFLNGLNPVL DtRCHYLPKN FPVTDAMVAS LLGPGTSLQA ELEKGSLFLV DHOILSGIQT 300 
NVINGKPQFS AAPMTLLYQS PGCGPLLPIA IQLSQTPGFN SPIFLPTDDK WDWUAKTWV 360 
HNAEFSFHEA LTHLLHSHIX PEVFTLATLR QLPHCHPLFK LUPHTRYTL IHNTLARELL 420 
IVPGQWDRS TGIOIEGFSE UQRNMKQLN YSLLCLFEDI RTRGVEDIPG YYYRDDGMQI 480 
WQAVERFVSE UGIYYFSDE SVQDDRELQA WVREIFSKGF LNQESSG1PS SLETREALVQ 540 
YVTMVIFTCS AKHAAVSAGQ FDSCAWMFNL PPSMQUTIT SKGLATCEGF IATLPPVNAT 600 
CDVOALWLL SKEPGDQRPL GTYPDEHFTE EAPRRStATF QSRLAQISRG KJERNRGLVL 660 
PVTYLDPPU ENSVSI 



SEQ ID NO:135 PFH4 DNA SEQUENCE

Nucleic Acid Accession #: NM_002742

Coding sequence: 236-2974 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

GAATTCXnTCTCTCXntXTCCrCCKXXnTCTCXICG^ 60 

OCICCCGATC CTCATCCOCT TGCOCTCCCC CAGCCCAGGG ACTTTTOCGG AAAGTTTTTA 120
111 rCOGTCT GGGCTCTCGG AGAAAGAAGC TCCTGGCTCA GCGGCTGCAA AACTTTCCTG 180 
CTGCCGCGCC GCCAGCCCCC GCCCTCCGCT GCOCGGCCCT GCGCCCCGCC GAGC QATG AG 240 
aKXXCTCCGGTCCTXKXKXCGaXVWjTCCCH^^ 300 
AGCGGCCGCC CCACIGGTCC CAGGGTOCOG GOCCGGGCCC GCGCCGTTCT TGGCTCCTGT 360 

CGCGGCCCCG GTCGGGGGCA TCTCGTTCCA TCTGCAGATC GGCCTGAGCC GTGAGCCGGT 420
GCTGCTGCTO CAGGACTCGT COGGGGACTA CAGCCTGGOG CACGTCCGCG AG ATGCCTTG 480 
CTOCATTGTC GACCAGAAGT TCCCTGAATG TGG 1'ITCT A C GG AATGTATG ATAAGATCCT 540 
GCrmTOGC CATGAOCCTA CCTCTGAAAA CATCCTTCAG CTGGTGAAAG CGGCCAGTGA 600 
TATCCAGGAA GGCGATCTTA TTGAAGTGGT CT TGT CACGT TCCGCCACCT TTGAAGACTT 660 

TCAGATICGT CCCCACGCTC TCTTTGTTCA TTCATACAGA GCTCCAGCTT TCTGTGATCA 720

CTGTGGAGAA ATGCTGTGGG GGCTGGTAGG TCAAGGTCTT AAATGTGAAG GGTGTGGTCT 780 
GAATTACCAT AAGAGATGTG CATTTAAAAT ACOCAACAAT TGCAGCGGTG TGAGGCGGAG 840 
AAGGCTCTCA AACGTTTCCC TCACTGGGGT CAGCACCATC CGCACATCAT CTGCTGAACT 900 
CTCTACAAGT GCCCCTGATG AGCCCCTTCT GCAAAAATCA CCATCAGAGT CGTTTATTGG 960 

TCGAGAGAAG AGGTCAAATT CTCAATCATA CATTGGACGA CCAATTCACC TTOACAAGAT 1020
nTGATGTCT AAAGTTAAAG TGCCGCACAC ATTTGTCATC CACTCCTACA CCCGGCCCAC 1080 
AGTGTGCCAG TACTGCAAGA AGCTTCTGAA GGGGCTTTTC AGGCAGGGCT TGCAGTGCAA 1140 
AGATIGCAG A TTCAACTGCC ATAAACGTTG TGCACCGAAA GTACCAAACA ACTGCCTTGG 1200 
OGAAGTGACC ATTAATGGAG ATTTGCTTAG CCCTGGGGCA GAGTCTGATG TGGTCATGGA 1260 

AGAAGGGAGT GATGACAATG ATAGTGAAAG GAACAGTGGG CTCATGGATG ATATGGAAGA 1320
AGCAATGGTC CAAGATGCAG AGATGGCAAT GGCAGAGTGC CAGAACGACA GTGGCGAGAT 1380 
GCAAGATCCA GACCCAG ACC ACGAGGACGC CAACAGAACC ATCAGTCCAT CAACAAGCAA 1440 
CAATATCCCA CTCATGAGGG TAGTGCAGTC TGTCAAACAC ACG AAGAGGA AAAGCAGCAC 1500 
AGTCATGAAA GAAGGATGGA TGGTCCACTA CACCAGCAAG GACACGCTGC GGAAACGGCA 1560 

CTATTGGAGA TTGGATAGCA AATGTATTAC CCTCTTTCAG AATGACACAG GAAGCAGGTA 1620
CTACAAGGAA ATTCCTTTAT CTGAAATnT GTCTCTGGAA CCAGTAAAAA CTXCAGCTTT 1680 
AATTCCTAAT GGGGCCAATC CICATTGTTT CG AAATCACT ACGGCAAATG TAGTGTATTA 1740 
TGTGGGAGAA AATGTGGTCA ATCCTTCCAG CCCATCACCA AATAACAGTG TTCTCACCAO 1800 
TGGCGTTGGT GCAGATGTGG CCAGGATGTG GGAGATAGCC ATCCAGCATG CCCTTATGCC 1860 

CGTCATTCCC AAGGGCTCCT CCGTGGGTAC AGGAACXZAAC TTGCACAGAG ATATCTCTGT 1920
GAGTATTTCA GTATCAAATT GCCAGATTCA AGAAAATGTG GACATCAGCA CAGTATATCA 1980 
GATTTTTCCT GATGAAGTAC TGGGTTCTGG ACAGTTTGGA ATTGTTTATO GAGGAAAACA 2040 
TCGTAAAACA GGAAGAG ATG TAGCTATTAA AATCATTGAC AAATTACGAT TTCCAACAAA 2100 
ACAAGAAAGC CAGCTTOGTA ATGAGGTTGC AATTCTACAG AACCTTCATC AOCCTGGTGT 2160 

TGTAAATTTG GAGTGTATGT TTGAGAOGCC TGAAAGAGTG TTIGTTGTTA TGGAAAAACT 2220
CCATGGAG AC ATGCTGG AAA TG ATCTTGTC AAGTG AAAAG GGCAGGTTGC CAGAGCACAT 2280 
AACGAAGTTT TTAATTACTC AG ATACTCGT GGCTTTGCGG CACCTTCATT TTAAAAATAT 2340 
OGTICACTGT GACCTCAAAC CAGAA AATGT GTTGCTAGCC TCAGCTGATC CTTTTC C TCA 2400 
GGTGAAACTT TGTGATTTTG GTTTTGCCCG GATCATTGGA GAGAAGTCTT TCCGGAGGTC 2460 

AGTGGTGGGT ACCCCCGCTT ACCTGGCTCC TGAGGTCCTA AGGAACAAGG GCTACAATCG 2520
CTCTCTAGAC ATGTGGTCTG TTGGGGTCAT CATCTATCTA AGCCTAAGCG GCACATTCOC 2580 
ATTTAATG AA GATGAAGACA TACACGACCA AATTCAGAAT GCAGCTTTCA TCTATCCACC 2640 
AAATCCCTGG AAGG AAATAT CTCATGAAGC CATTG ATCTT ATCAACAATT TGCTGCAAGT 2700 
AAAAATGAGA AAGCGCTACA GTGTGGATAA GACCTTGAGC CACCCTTGGC TACAGG ACTA 2760 

TCAGACCTGG TTAGATTTGC GAGAGCTGGA ATGCAAAATC GGGGAGCGCT ACATCACCCA 2820
TG AAAGTG AT GACCTG AGGT GGGAGAAGTA TGCAGGCGAG CAGCGGCTGC AGTACCOCAC 2880 
ACACCTGATC AATCCAAGTO CTAGCCACAO TGACACTCCT GAGACTGAAG AAACAGAAAT 2940 
GAAAGCCCTC GGTGAGCGTG TCAGCATCCT C3QAGTTCCA TCTCCTATAA TCTGTCAAAA 3000 
CACTGTGGAA CTAATAAATA CATACGOTCA GGTTTAACAT TTGCCTTGCA GAACTGCCAT 3060 

TATTTTCTGT CAGATGAGAA CAAAGCTGTT AAACTGTTAO CACTGTTGAT GTATCTGAGT 3120
TGCCAAGACA AATCAACAGA AGCATTTGTA TTTTGTGTGA CCAACTGTGT TGTATTAACA 3180 
AAAGTTCCCT GAAACACGAA ACTTGTTATT GTGAATGATT C ATGTTATAT TTAATGCATT 3240 
AAACCTGTCT CCACTGTGCC TTTGCAAATC AGTGTTTTTC TTACTGG AGC TTCATTTTGG 3300 
TAAGAGACAG AATGTATCTG TGAAGTAGTT CTGTTTGGTQ TGTCCCATTG GTGTTQTCAT 3360 

TGTAAACAAA CTCTTGAAGA GTCGATTATT TCCAGTGTTC TATGAACAAC TCCAAAACCC 3420

ATGTGGGAAA AAAATGAATO AGGAGGGTAG GGAATAAAAT CCTAAG ACAC AAATGCATGA 3480 
ACAAGTTTTA ATGTATAGTT TTGAATCCTT TGCCTGCCTG GTGTGCCTCA GTATATTTAA 3540 
ACTCAAOACA ATGCAOCTAG CTGTGCAAGA CCTAGTGCTC TTAAGCCTAA ATGCCTTAGA 3600 
AATGTAAACT GCCATATATA ACAGATACAT TTCCCTCTTT CTTATAATAC TCTGTTGTAC 3660 
TATGGAAAAT CAGCTGCTCA GCAAOCTTTC AOCTTTC IGT. ATTTTTCAAT AATAAAAAAT 3720
ATTCTIGTCA AAAAAAAAAA AA 



SEQ ID NO:136 PFH4 Protein sequence:
Protein Accession #: NP_002733.1



1 11 21 31 41 51

I I I I I I

MSAPPVLRFP SPLLPVAAAA AAAAAALVPC3 SGPGPAPFIA PVAAPVGGIS FHLQIGLSRB 60 
PVLLLQDSSG DYSLAHVREM ACSIVDQKFP ECGFYGMYDK nURHDPTS ENOjQLVKAA 120 
SDIQEGDUE WLSRS ATFB DFQKPHALF VHSYRAPAFC DHCGEMLWGL VRQGLKCEGC 180 
GLKYHKRCAF KIFNNCSGVR RRRLSNVSLT GVSTIRTSSA ELSTSAPDEP LLQKSPSESF 240 
USBBKRSSSQ S YIGRPIHLD KILMSKVKVP HTFVIHSYTR PTVOQYCKHL LKGLFRQGLQ 300 
CKDCRFNCHK RCAPKVFNNC LGEVTINGDL LSPGAESDW MEEGSDDNDS ERNSGLMDDM 360 
EEAMVQDAEM AMAECQNDSG EMQDFDPDHE DANRTISFST SNNIPLMR W QS VKHTKRKS 420 
STVMKEGWMV HYTSKDTLRK RHYWRLOSKC ITLFQNDTGS RYYKHPLSE ILStEPVKTS 480 
AUPNGANPH CFETTTANW YYVGENWNP SSPSFNNSVL TSGVGADVAR MWEIAIQHAL 540 
MPVIPKGSSV GTGTNLHRDI SVSISVSNCQ IQENVDISTV YQIFPDEVLG SGQFGIVYGG 600
KHRKTGRDVA IKIIDKLRFP TKQESQLRNE VAILQNIHHP GWNLECMFE TPERVFWME 660 
KLHGDMLEMI LSSEKGRLPE HITKFUTQI LVALRHLHFK NIVHCDLKFE NVLLAS ADFF 720
PQVKLCDFGF ARIIGEKSFR RS WGTPAYL APEVLRNKGY NRSLDMWS VG VTJYVSLSGT 780 
FPFNEDEDIH DQIQNAAFMY PPNPWKHSH EAIDLINNLL QVKMRKRYSV DKTLSHPWLQ 840 
DYQTWLDLRE LECK1GERY1 THESDDLRWE KYAGEQRLQY PTHLINPSAS HSDTPETEET 900 
EMKALGERVSIL 



SEQ ID NO:137 PFH3 DNA SEQUENCE

Nucleic Acid Accession #: X95425

Coding sequence: 712-3825 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I 

AATGGTCAGT CAATACATTA TAACATAATA CACCAAATGC TAGAATAGAA GGGGAGGGGG 60 
GCACACATAA TGACTCACTG CTGGAAG AAG GGTGCATCAG TGAATT AAAA AATGTCCCTC 120 
CCCTCTTCAG CACTCAGCGC GCAGCTATTT CCTTCTGCCA GT C1C1 1 1G A ACTCTGGATC 180 
TITCCTTTT O CTCGCTGCTC TCCTGTTTTT CATTCTCCAC ATTTTCTCAA TCCTCTTTCT 240 
TTATOCTTAG CCACCCTGCT Ti l l 1U C IOC U 1 1 1 1 AAAA AATCGGAGAT TTCGTCTTAA 300 
AATGATTTGT CTTCCTTACC TTCGTCCATT TCAACACTGA AGGCTGCAAA GAACTTCACC 360 
TTTCCCCTAG TGGTATTTAA AAATTCTCAA TCCGTAAAAA GTCTTTTTGA AAGGCAAAGG 420 
AACAGGACCC AGACCCTCTC GACACCCTTQ ATCCGAGTCA GATCTGCACT AGCAACCAGA 480 
ACTAATATTT CATTTAACCC ACCAAAAGGG GGAGGCGAGA GGAGCCAGAA CCAAACITCA 540
TCFGTCTCAG ACGGATCCGT GGTTOCTACA TTTGGAGGAG CCGCGTGTCA GAAGGCGTAG 600 
GACCOCAAGG GGGGACAAGG AGG ACTCCCG AGTCTCCCTT CTCCGCICTC CGAGACCGAA 660 
GAGGTGGACT GAGCCGCTCG GGACAGCGGC ACCGGAGGAG GCTCGGAQAA GATG CGGGGC 720 
TOOGGGCCCC GGGGTGCGGG ACACCGGCGG CCCCCAAGCG GCGGCGGCG A CAOCCCCATC 780 
AOCCCAGCGT CCCTGGCCGG CTGCTACTCT GCACCTCGAC GGGCTCCCCT CTGG ACGTGC 840 
CITCTCCTGT GCGCCGCACT CCGGACCCTC CTGGCCAGCC CCAGCAAGGA AGTGAATTTA 900 
TTGQATTCAC GCACTGTCAT GGGGOACCTG CGATGGATTG CTTTTCCAAA AAATGGGTGG 960 
GAAGAGATTG GTG AAGTGG A TG AAAATTAT GCCCCTATCC ACACATACCA AGTATGCAAA 1020 
GTGATGGAAC AGAATCAGAA TAACTGGCTT TTGACCAGTT GGATCTCCAA TGAAGGTGCT 1080 
TCCAGAATCT TCATAO AACT CAAATTTACC CTGCGGGACT GCAACAGCCT TOCTGGAGGA 1140 
CTGGGO ACCT GTAAGG AAAC CTTTAATATG TATT ACTTTG AGTCAGATGA TCAGAATGGG 1200 
AOAAACATCA AGG AAAACCA ATACATCAAA ATIGATACCA TTGCTGCCGA TG AAAGCTTT 1260 
ACAGAACTTG ATCTTGGTG A CCGTGTTATG AAACTGAATA CAGAGGTCAG AGATGTAGGA 1320 
CCTCTAAGCA AAAAGGGATT TTATCTTGCT TTTCAAGATG TTGGTGCTTG CATTGCTCTG 1380 
GTTTCTGTGC GTGTATACTA TAAAAAATGC CCTTCIGTOG TACGACACTT GGCTGTCTTC 1440 
CCTGACACCA TCACTGGAGC TGATTCTTCC CAATTGCTCG AAGTGTCAGG CTCCIGTGTC 1500 
AAGCATTCTG TGACCGATGA ACCTCCCAAA ATGCACTGCA GCGCCGAAGG GGAGTGGCTG 1560 
GTGCCCATCG GGAAATGCAT GTGCAAGGCA GGATATGAAG AGAAAAATGG CACCTGTCAA 1620 
GTGTGCAGAC CTGGGTTCTT CAAAGCCTCA CCTCACATCC AGAGCTGCGG CAAATGTCCA 1680 
CCTCACAGTT ATACCCATG A GGAAGCTTCA ACCTCTTGTG TCTGTGAAAA GGATTATTTC 1740 
AGGAGAGAGT CIOATCCACC CACAATGGCA TGCACAAG AC CCCCCTCTGC TCCTCGGAAT 1800 
GCCATCTCAA ATGTTAATGA AACTAGTGTC TTTCTGGAAT GGATTCCGCC TGCTGACACT 1860 
GGTGGAAGG A AAG ACGTGTC ATATTATATT GCATGCAAGA AGTGCAACTC CCATGCAGGT 1920 
GTGTGTGAGG AGTGTGGCGG TCATGTCAGG TACCTTCCCC GGCAAAGCGG CCTGAAAAAC 1980 
ACCTCTGTCA TGATGGTGGA TCTACTOGCT CACACAAACT ATACCTTIGA GATTGAGGCA 2040 
GTGAATGGAG TGTCCGACTT GAGCOCAGGA GCCCGGCAGT ATGTGTCTGT AAATGTAACC 2100 
ACAAATCAAG CAGCTCCATC TCCAGTCACC AATGTGAAAA AAGGGAAAAT TGCAAAAAAC 2160 
AGCATCTCTT TGTCTIGGCA AGAACCAG AT CGTCCCAATG GAATCATCCT AGAGTATGAA 2220 
ATCAAGCATT TTGAAAAGGA CCAAGAGACC AGCTACACGA TTATCAAATC TAAAGAGACA 2280 
ACTATTACTG CAG AGGGCTT GAAACCAGCT TCAOTTTATG TCTTCCAAAT TCG AGCACGT 2340 
ACAGCAGCAG GCTATGGTJGT C7TCAGTCGA AGATTTGAGT TTGAAACCAC CCCAGTGTTT 2400 
GCAGCATCCA GCGATCAAAG CCAGATTCCT GTAATTGCTG TGTCTGTO AC AGTAGGAGTC 2460 
A TTTTGTTG G CAGTGGTTAT CGGCGTCCTC CTCAGTGGAA GTTGCTGCGA ATGTGGCTGT 2520 
GGGAGGGCTT CTTCCCTGTG CGCTGTTGCC CATCCAATCC TAATATGGCG GTGTGGCTAC 2580 
AGCAAAGCAA AACAAGATCC AGAAGAGGAA AAGATGCATT TTCATAATGG GCACATTAAA 2640 
CTOCCAGGAG TAAGAACTTA CATTQATCCA CATACCTATG AGGATCCCAA TCAAQCTGTC 2700
CAOGAATTIO CCAAGCAOAT AOAAGCATCA TGTATCACCA TFGAGAGAGT TATTOGAGCA 2760 
GGTGAATTTG GTG AAGTtTG TAGTGO ACGT TTCAAACTAC CAGG AAAAAG AGAATTACCT 2820 
GTGGCTATCA AAACCCTTAA AOTAGGCTAT ACTGAAAAGC AACGCAGAGA TTTCCTAGGT 2880 
GAACCAAOTA TCATGGGACA GTTTGATCAT CCTAACATCA TCCATTTAGA AGGTGTGGTG 2940 
AGCAAAAOTA AACCAGTG AT GATCGTGACA GAGTATATGO AG AATGGCTC TTTAGATACA 3000 
TTTTTGAAGA AAAACGATGG GCAGTTCACT GTG ATTCAGC TTGTTGGCAT GCTGAGAGGT 3060 
ATCTCTGCAG GAATGAAGTA CCTTTCTGAC ATGGGCTATG TGCATAGAGA TCTTGCTGCC 3120 
AGAAACATCT TAATCAACAG TAACCTTGTG TGCAAAGTGT CTGACTTTGG ACTTTCCCGG 3180 
GTACTGG AAG ATOATCCCGA GGCAGCCTAC AOCACAAGGG GAGGAAAAAT TCCAATCAGA 3240 
TGGACTGCCC CAGAAGCAAT AGCTTTCOG A AAGTTTACTT CTGCCAGTGA TGTCTGGAGT 3300 
TATGGAATAG TAATGTGGGA AOTTGTGTCT TATGGAGAGA GACCCTACTG GGAGATGAOC 3360 
AATCAAGATG TGATTAAAGC GGTAGAGGAA GGCTATCGTC TGCCAAGCCC CATGGATTGT 3420 
CCIGCTGCTC TCTATCAGTT AATGCTGG AT TGCTGGCAG A AAOAGCGAAA TAGCAGGCCC 3480 
AAGTTTGATO AAATAGTCAA CATGTTGGAC AAGCTGATAC GTAACCCAAG TAG1CTQAAG 3540 
AOGCTGGTTA ATGCATCCTG CAGAGTATCT AATTTATTGG CAGAACATAG OOCACTAGGA 3600 
TCTGGGGCCT ACAGATCAGT AGGTG AATGG CTAG AGGCAA TCAAGATGGG CCGGTATACA 3660 
GAGATTTTCA TGGAAAATGG ATACAGTTCA ATGOACGCTG TGGCICAGGT GACCTTGGAG 3720 
GATTTGAGAC GGCTTGGAGT GACICTTGTC GG1CAGCAGA AGAAGATCAT G AACAGCCTT 3780 
CAAGAAATGA AGGTGCAGCT GGTAAACGGA ATGGTGOCATTGTAACTICA TGTAAATGTC 3840 
GCTTCTTCAA GTG AATGATT CTGCACTTTG TAAACAGCAC TGAGATTTAT TTTAACAAAA 3900 
AAA 



SEQ ID NO:138 PFH3 Protein sequence:
Protein Accession #: CAA64700.1



1 11 21 31 41 51 
I I I I I I 

MRGSGPRGAG HRRPPSGGCD TPITPASLAG CYSAPRRAPL WTCLLLCAAL RTLLASPSNE 60 
VNLLDSRTVM GDLGWIAFPK NGWEEIGEVD ENYAPMTYQ VCKVMEQNQN NWU.TSWISN 120 
BGASRfflEL KFTLRDCNSL PGGLGTCKET FNMYYFESDD QNGRNKENQ YKIDTIAAD 180 
ESFTELDLGD RVMKLNTEVR DVGPLSKKGF YLAFQDVGAC IALVSVRVYY KKCPSWRHL 240 
AVFPDTITGA DSSQLLEVSQ SCVNHS VTDE PPKMHCSAEG EWLVPIGKCM CKAGYEEKNG 300 
TOQVCRPGEF KASPHIQSCG KCPPHSYTHE EASTSCVCEK DYFRRESDPP TMACTRPPSA 360 
PRNAENVNE TSVFLEWIPP ADTGGRKOVS YYIACKXCNS HAGVCEECGG HVRYLPRQSG 420 
LKNTS VMMVD LLAHTNYTFE IEA VNGVSOL SPG ARQYVSV NVTINQAAPS PVTNVKKGKI 480 
AKNSISLSWQ EPDRPNGUL EYEIKHFEKD QETSYTUKS KETTTTAEGL KPASVYVFQI 540 
RARTAAGYGV FSRRFEFETT PVFAASSDQS QIPVIAVS VT VGVH1AWI GVLLSGSCCE 600 
OGOGRASSLC AVAHPHJWR CGYSKAKQDP EEEKMHFHNG KKLPGVRTY IDPHTVEDPN 660 
QAVHEFAKEI EASCITTERV IGAGEPGEVC SGRIJCLPGKR ELPVAKTLK VGYTEKQRRD 720 
FLGEASIMGQ HJHPNHHLE G WTKSKPVM IVTEYMENGS LDTF1KKNDG QFTVTQLVGM 780 
LRGISAGMKY LSDMGYVHED LAARNHJNS NLVCKVSDFG LSRVLEDDPE AAYTTRGGKI 840 
PHWTAPEAI AFRKFTSASD VWSYGIVMWE WSYGERPYW EMTNQOVDCA VEEGYRLPSP 900 
MDCPAALYQL MLDCWQKERN SRPKFDETVN MLDKLIRNPS SUCTLVNASC RVSNLLAEHS 960 
PLGSGAYRSV GEWLEA1KMG RYTEIFMENG YSSMDAVAQV TLEDLRRLGV TLVGHQKKIM 1020 
NSLQEMKVQL VNGMVPL 



SEQ ID NO:139 PFH2 DNA SEQUENCE

Nucleic Acid Accession #: NM_016029

Coding sequence: 78-1097 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

CTGCGATCCC GCAGGGCAGC GACGCGACTC TGGTGCGGGC COT CTTCI P C CCCCCGAGCT 60 
GGGCGTGCGC GGCGGCAATg AACTGGGAGC TGCTGCTGTG GCTGCTGGTG CTGTGCGCGC 120 
TGCTGCTGCT CTTGGTGCAG CTGCTGCGCT TCCTGAGGGC TQACGGOGAC CTGACGCTAC 180 
TATGGGOGGA GTGGCAGGGA CGACGCGCAG AATGGGAGCT GACTGATATG GTGGTGTGGG 240 
TGACTGGAGC CTCGAGTGGA ATTGGTGAGG AGCTGGCTTA CCAGTTGTCT AAACTAGGAG 300 
11 1C1C1 1UT GCTGTCAGCC AGAAGAGTGC ATGAGCTGGA AAGGGTGAAA AGAAGATGCX 360 
TAGAGAATGG CAATTTAAAA GAAAAAGATA TAC11U11TT GCCGCTTGAC CTGACCGACA 420 
CTGGTTCCCA TGAAGCGGCT ACCAAAGCTG TTCTCCAGGA GTTTGGTAGA ATCGACATTC 480 
TGGTCAACAA TGGTGGAATG TCCCAGCGTT CTCTCTGCAT GGATACCAGC TTGGATGTCT 540 
ACAGAAAGCT AATAGAGCTT AACTACTTAG GGAOGGTGTC CITGACAAAA TGTGTICrGC 600 
CTCACATG AT COAGAGG AAG CAAGG AAAG A TTGTTACTOT GAATAGCATC CTGGGTATCA 660 
TATCTGTACC TCTTTCCATT GGATACTGTG CTAGCAAGCA TGCTCTCCGG GGTTTrTTTA 720 
ATGGCCTTCG AACAGAACTT GCCACATACC CAGGTATAAT AGTTTCTAAC ATTTGCCCAG 780 
GACCTGTGCA ATCAAATATT GTGGAGAATT COCTAGCTOG AGAAGTCACA AAG ACTATAG 840 
GCAATAATOG AGACCAGTCC CACAAGATGA CAACCAOTCO TTOTGT G CGG CTGATGTTAA 900 
TCAGCATGGC CAATGATTTG AAAGAAGTTT GGATCTCAGA ACAACCTTTC TTGTTAGTAA 960 
CATATTTGTG GCA ATACA TG CCAACCTGGG CCTGGTGGAT AACCAACAAG ATGGGGAAOA 1020 
AAAGGATTGA GAACTTTAAG AGTGGTGTGO ATGCAQ ACTC TTCTTATTTT AAAATCTTTA 1080 
AGACAAAACA TGACXQAAAA GAGCACCTOT ACTTTTCAAG GCACTGGAGG GAGAAATGGA 1140 
AAACATGAAA ACAGCAATCT TCTTATGCTT CTG AATAATC AAAGACTAAT TTGTGATTTT 1200 
ACnnTAAT AGATATGACT TTGCTTOCAA CATGGAATGA AATAAAAAAT AAATAATAAA 1260
AGATTGOCAT G AATCTTGCA AA 



SEQ ID NO:140 PFH2 Protein sequence:
Protein Accession #: NP_057113.1

1 11 21 31 41 51 
I I I I I I 

MNWELLLWLL VLCALLLLLV QLLRFLRADG DLTLLWAEWQ GRRPEWELTD MWWVTO ASS 60 
GIGEELAYQL SKLGVSLVLS ARRVHELERV KRRCLENGNL KEKDILVLPL DLTDTGSHEA 120 
ATKAVUQEFG RIDILVNNGO MSQRSLCMDT SLDVYRKLIE LNYLOTVSLT KCVLPHMER 180 
KQGKTVTVNS EGIISVPLS IGYCASKHAL RGFFNGLRTB LATYPGnVS NICPGPVQSN 240 
IVENSLAGEV TKTIGNNGDQ SHKMTTSRCV RLMLISMAND LKEVWISEQP FIXVTYLWQY 300 
MPTWAWWITN KMGKKRIENF KSGVDADSSY FKEFKTKHD 



SEQ ID NO:141 PFH1 DNA SEQUENCE

Nucleic Acid Accession #: NM_021614

Coding sequence: 1-1740 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

ATGAGCAGCT GCAGGTACAA CGGGGGCGTC ATGCGGCCGC TCAGCAACIT GAGCGCGTOC 60
GGCCGGAACC TGCACGAGAT ggactcagag GCGCAGCCCC TGCAGCOXC CGCGTCTGTC 120 
GGAGGAGGTG GCGGCGCGTC CTCCCCGTCT GCAGCCGCTG CCGCCGCCGC CGCTGTTTCG 180 
TCCTCAGCOC CCGAGATGGT GGTGTCTAAG OCOGAGCACA ACAACTCCAA CAAOCTGGOG 240 
CTCTATGGAA CCGGCGGCGG AGGCAGCACT GGAGGAGGCG GOGGCGGTGG CGGGAGCGGG 300 
CACGGCAGCA GCAGTGGCAC CAAGTCCAGC AAAAAGAAAA ACCAGAACAT CGGCTACAAG 360 
CTGGGGCACC GGCGCGCCCT GTTCGAAAAG CGCAAOCGGC TCAGOQ ACTA CGOGCTCATC 420 
TICGGCATGT TCGGCATCGT GGTCATGGTC ATCGAGACCG AGCTGTCGTG GGGCGCCTAC 480 
GACAAGGCGT CGCTGTATTC CTTAGCTCTG AAATGOCTTA TCAGTCICIC CACGATCATC 540 
CTGCTCGGTC TGATCATCGT GTACCACGOC AGGGAAATAC AGTTGTTCAT GGTGGACAAT 600 
GGAGCAG ATG ACTGG AG AAT AGGCATGACT TATG AGOGTA TTTTCTTCAT CTGCTTGGAA 660 
ATACTGGTGT GTGCTATTCA TCCCATACCT GGGAATTATA CATTCACATG GACGGCCCGG 720 
CTTGCCTTCT CCTATGCCOC ATCCACAACC ACCGCTGATG TGG ATATTAT TTTATCTATA 780 
CCAATGTTCr TAAGACTCTA TCTGATTGCC AGAGTCATGC TTTTACATAG CAAAC1 1 1 1C 840 
ACTGATGOCT CCTCTAGAAG CATTGGAGCA CTTAATAAGA TAAACTTCAA TACACOTTTT 900 
GTTATGAAGA CTTTAATGAC TATATGGCCA GG AACTGTAC TCTTGGTTTT TAGTATCTCA 960 
TTATGGATAA TTGCCGCATG GACTGTCCGA GCTTGTGAAA GGTACCATGA TCAACAGGAT 1020 
GTTACTAGCA AC rT C C TTG G AGCGATGTGG TCOATATCAA TAA CTTTIC T CTCCATTGGT 1080 
TATGGTGACA TGGTACCTAA CACATACTGT GG AAAAGGAG TCTGCTTACT TACTGGAATT 1140 
ATGGGTGCTG GTTGCACAGC OCTGGTGGTA GCTGTAGTGG CAAGGAAGCT AGAACTTACC 1200 
AAAGCAGAAA AACACGTGCA CAATTTCATG ATGG ATACTC AGCIGACTAA AAG AGTAAAA 1260 
AATGCAGCTG CCAATGTACT CAGGGAAACA TGGCTAATTT ACAAAAATAC AAAGCTAGTG 1320 
AAAAAGATAG ATCATGCAAA AGTAAGAAAA CATCAACGAA AATTCCTGCA AGCTATTCAT 1380 
CAATTAAGAA GTGTAAAAAT GGAGCAGAGG AAACTQAATG AOCAAGCAAA CACTTTGGTG 1440 
GACTTGGCAA AGACCCAGAA CATCATGTAT GATATGATTT CTGACTTAAA CGAAAGGAGT 1500 
GAAG ACTTCG AG AAG AGG AT TGTTACCCTG GAAACAAAAC TAG AGACTTT GA1TGGTAGC 1560 
ATCCAOGCCC TCCCTGGGCT CATAAGCCAG ACCATCAGGC AGCAGCAGAG AGATTTCATT 1620 
GAGGCTCAG A TGGAG AGCTA CGACAAGCAC GTCACTTACA ATGCTOAGCG GTCCCGGTCC 1680 
TCGTCCAGGA GGCGGCGGTC CTCTTCCACA GCACCACCAA CTTCATCAGA GAGTAGCTAG 



SEQ ID NO:142 PFH1 Protein sequence:
Protein Accession #: NP_067627

1 11 21 31 41 51 
I I I I I I 

MSSCRYNGGV MRPLSNLS AS RRNLHEMDSE AQPLQPPAS V GGGGGASSPS AAAAAAAAVS 60 
SSAPEIWSK PEHNNSNNLA LYOTGGGGST GGGGGGGGSG HGSSSGTKSS KKKNQNIGYK 120 
LGHRRALFEK RKRLSDYALI FGMFGIWMV ETELSWGAY DKASLYSLAL KCUSLSTH 180 
LLGLUVYHA REQLFMVDN GADDWRIAMT YERIFFICLE ILVCAIHPIP GNYTFTWTAR 240 
LAFSYAPSTT TADVDIILSI PMFLRLYUA RVMLLHSKLF TDASSRSIGA LNKINFNTRF 300 
VMKTLMTICP GTVLLVFSIS LWUAAWTVR ACERYHDQQD VTSNFLGAMW LISITFLSIG 360 
YGDMVFNTYC GKG VCLLTGI MG AGCTALVV A WARKLELT KAEKHVHNFM MDTQLTKRVK 420 
NAAANVLRET WUYKNTKLV KKIDHAKVRK HQRKFLQAW QLRS VKMEQR KLNDQANTLV 480 
DLAKTQNIMY OKOSDLNERS EDFEKRIVTL ETKLETUGS IHALPGUSQ TIRQQQRDFI 540 
EAQMESYDKH VTYNAERSRS SSRRRRSSST APPTSSESS 



SEQ ID NO:143 PFG9 DNA SEQUENCE
Nucleic Acid Accession #: AL110139, coding region is FGENESH predicted
Coding sequence: 1-1896 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
43ScaaiccGTGaxxnGcccGccccGCTC a

GCTCCCGCOO CCCGCGCCAO CAGAGCCGAG TCCGTCTOCO CXJCCGTGGCC CGAACCCOAG 120 
CGCGAGTOGC GGCCACCGCC CGGCOCGGGG CCCGGGAACA CCACCCGOTT TGGGTCTGGG 180 
GCGGCGGGCG GCAGCGGCAG CTCCAGCICC AACAGCAGTC GCGACGOCIT GGTGACCCGC 240 
ATTTCCATCC TCCTCCGCGA CCTACCCACC CTCAAOGCAG CCGTGATCGT GGCGTTCGCC 300 
TTTACCACCC TCCTCATCGC CTGCCTGCTG CTGOGOGTCT TCAGOTCGGG AAAG AGGTTA 360 
AAGAAG ACAC GCAAGTATGA TATCATCACC ACTCCAGCAG AGCGAGTGGA AATGGCGCCA 420 
CTAAATGAAG AGGATGATGA AGATGAGGAC TCCACAGTAT TCOACATCAA ATACAGAGTG 480 
TCCTTGCCGG CTGCACTGAG ACGTCAGCTG CCAGGGTGCC AGACGCTACT GACAGTTCCT 540 
GTGCCCCCAC CCTTCATCCT CGACATTQAC CTTCCAGCAA OATGCAGTGG AAGGCCTOAT 600 
GOTGGAATCA GACCTGGTAA AACCTGTTTC CCAGCCTGGT GGCATCCTOT GGAAAGTTGG 660 
TCAGCTGCAA CCTGGGGTGT GAAGGACTGO AOCTGGAAGC CCTCTTGCGT CGGAGGTGTT 720
GAAACCAAAA CG AACGTTAT GTATAAAACC CCAGCTCCAT CGTGCGTGTC AGGCATCTGC 780 
TCAGACTGTC ACTGGCAAGC TCGTTTCCAC GTCACCACAA TGQAGTTGCT TCTGCCACCC 840 
TTTQGGCATC CCTTTAAAGT GCOCCCTACT TCTACTCCCC ATGGTTTTCG ACAACIGCAG 900 
CTGAATCTCA TGGAAAAGCT GGATTCCTCT GCCTTACGCA G AAACACCCG GGCTOCATCT 960 
GCCAGGTGCT TGCCACTGGT CCTGGCAOAA ATGGOGGCTG CTGAAAGTG A CCTTCCAAAT 1020 
OC11 U O 1UGC ACTTCAGCGC CACAOGCTCT OCAATAAAAA CCCTTTACAC ACAAACCATO 1080 
AQTACCTTQG GCTTOOATGI TTTCTGTGGT GCCGGOCAGC GGGGCACCTT TTGTG AAG AC 1140 
AQAGCAGTG A CTAAGGTTCT OCAGGGTAGC TCTTTCTGCA AACAGCTGCG CTGGAAGOCA 1200 
GCCCTAGAGA GTGGGTTTCC CCATCATCTC AGGCTTCTCA GAGAGTGTCC TCCGCTGAGC 1260 
ACCCATCCTG TCAGGTTGGC TCGTrCAGATGCCCGGGGAC AAGCCAGCCT GACGGGOAGG 1320 
AGGGTGTFTC GGCGTCOGCG GCAOTCTCTG CATGGOGOAG GGTCAGCGGG TACCGCAACT 1380 
TGCCTTTTGG TTTTGAAG AT TCTGTTGAGG CGCCATCCTC ACCTTGACCT CTICTACAAA 1440 
ATCTGTCTCC CCTGCTGTGC CGTGGAACAC CTACGGGAAG CCAAGAGAAG CTCAGTGACT 1500 
GTCCTTGCGT CATTTGAGCA GAGGCCACAA AAGGCAGCTG CTOCCCACGG GGAGCCTGTC 1560 
AAACGAGGGC CCAGTGGGCA ATTGACCAGA CACACATGCC CTGGCTGGGG GATCACACAT 1620 
GCQAACCTGC AGACAATTCC AGATACCCAA GGCCAGGAAG GCCCACGTGA GGATGTCACT 1680 
CACCCTGG AG GAGACTTGGA TGGGGTGGCA AATTTCTATT TGG AGGAAGA GGGTTTCCAG 1740 
GATGGCAGAT GOCAGAAGAT GGTGCTGATG TCTGAGGAAG GGGCAOCTAG TTTGACAGGA 1800 
TGTGAGAGGC TCACAGGTTC CCATCACTTC TCCAGCCATT CCAAGTCTTG GTOCTTCCTT 1860 
TCCCCCCGAC AGCCCCTGTT TCTGTCCAGG CCCJjSA 



SEQ ID NO:144 PFG9 Protein sequence:

Protein Accession #: none available, FGENESH predicted

1 11 21 31 41 51
I I I I I I 

MRAVPLPAPL LPLLLLALLA APAARASRAE SVSAPWPEPE RESRPPPGPG PGNTTRFGSG 60 
AAGGSGSSSS NSSGDALVTR ISILLRDLPT UCAAVIVAFA FTTLLIACLL LRVFRSGKRL 120 
KKTRKYDIIT TPAERVEMAP LNEEDDEDED STVFDIKYRV SLPAALRRQL PGCQTLLTVP 180 
VPPPFILDID LPARCSGRPD GGIRPGKTCF PAWWHPVESW S AATWG VKDW TWKPSCVGGV 240 
ETKTNVMYKT PAPSCVSGIC SDCHWQARFH VTTMELLLPP FGHPFKVPPT STPHGFRQLQ 300 
LNLMEKLDSS ALRRNTRAPS ARCLPLVLAE MAAAESDUN FWWHFS ATGS PIKTLYTQTM 360 
STLGLDVFCG AGQRGTFCED RAVTKVLQGS SF5KQLRWKP ALESGFPHHLRLLRECPPLS 420 
THPVRLARSD ARGQASLTGR RVFRRPRQSL HGGGS AGTAT CLLVLKILLR RHPHLDLFYK 480 
ICLPCCAVEM LREAKRSSVT VLASFEQSPQ KAAAAHGEPV KRGPSGQLTR HTCPGWGtTH 540 
ANLQTIPDTQ GQEGPREDVT HPGGDLDG VA NFYLEEEGFQ DGRCQKMVLM SEEGPPSLTG 600 
CERLTGSHHFSSHSKSWSFLSPRQPLFLSR P 



SEQ ID NO:145 PFG8 DNA SEQUENCE

Nucleic Acid Accession #: NM_013427

Coding sequence: 875-3799 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I 

GGCTGGGCTG GGAATAGCGT GTTGCTCTCC GGOGG AACAC ACACACCCGG CCTTGGGGCT 60 
GTCTCCTGAA GCTCGCTCCT CCACXKjAG AG CGCTGAGCGC CGCCGGGAAT TCCATCCCAC 120 
CGTGGGCACG CA G P C TT TGG AGGTCCCGGG CGCAGCACGC TCGOTGTCCC CACACTGCAO 180 
CAAGACAGAG ACCCCGCGGG AACCTTGAGC TTGGAACAAC CCTTGAGCCT CTGCAGTCGG 240 
AAGAGTGGGC GCAGCAGCCC AGCGGAGGCC AGGCGCGCAA CCTCGGGCGC CGGGGCAAGG 300 
AGAGAGTGCA GGGAGGOGCA GCTCAGGGGC CCGGCTCAGG AGCGGGAGGA AGTTCTCGCG 360 
GCGCCGGGAG CGCGGTGGAC GCGCCCTGGO CGCACGCXrA GGCAGCCTTC TCCCTGGCCC 420 
TCGOG ACTGT CCTCGGGCGC CAAGGAGO AG CTTOCTGGAG TCTTAGAGGC CATCCAGAGC 480 
CAGCGAGCAG GAGCGCTGGG TCTCCCGCCT CAGCTAGGAA GGGGGAGTGG CGCTGGCAGG 540 
CTGGAGCTGG GAACCCAGOG AGCGCCTGAC CTTCCTCCTC CTCTTCCTGA CCCTCTTCGC 600 
GTCTTGGGCT CCGGAGGAAG GTTCTAGCGG CTGCAGGAGG TCCOCAGACC CATTTTCCTA 660 
G AAGGCTGGT GATGG ATCTG CTGCTCCTGC CGCCGCCGGG GCACTTGG AG CGCACCGGCG 720 
GCGCGTGAGC TGGGCTTTGC TCTCCACCGC CCTGGGCAAA CCCCGGGCCA GCCCCGCCTG 780 
GCACCTTTGC CTGAGTCOCT TTCGGTTCCC GACCCAAAGC CACCAGCGTC CAGGG AGGGA 840 
GGAGGAGGTG GTCCTCAGGT GCAGCCCCGC CGAGATjffrCC GCGCAG AGCC TGCTCCACAG 900 
CGTCTTCTCC TGTTCCTCGC CCGCTTCAAG TAGGGGGGCC TCGGCCAAGG GCTTCTCCAA 960 
GAGGAAGCTG CGCCAG ACCC GCAGCCTGGA CCCGGCCCTG ATCGGCGGCT GCGGG AGCG A 1020 
CGAGGCGGGC GCGGAGGGCA GTGCGCGGGG AGCCACGGCG GGCCGCCTCT ACTCCCCATC 1080 
ACTCCCAGCC OAG AOTCTCG GCCCTOGCTT GGCGTCCTCT TCCCOGGGTC OGCCCCCCAG 1140 
GGCCACCAGG CTACCGCCTC CTGGACCTCT TTGCTCGTCC TTCTCCACAC CCAGCACCCC 1200 
GCAGGAGAAG TCACCATCCG GCAGCTTTCA CTTTQACTAT GAGGTTCCCC TGGGTCGOGG 1260
CGGCCTCAAG AAGAGCATGO CCTGGGACCT GCCTTCTOTC CTGGOCGGGC CAGCCAGTAQ 1320 
CCGAAGCGCT TOCAGCATCC TCTGTTCATC OGGGGGAGGC CCCAATGGCA TCTTOGCTTC 1380 
TCCTAGGAGG TGGCTCCAGC AGAGGAAGTT OCAOTOCCCA CCCGACAGTC GCGGGCACCC 1440 
CTACOTCGTG TGGAAATCOG AGGGTGATTT CACCTGGAAC AGCATGTCAG GCCGCAGTGT 1500 
GCGGCTOAGG TCAGTCCOCA TCCAGAGTCT CTCAOAGCn} GAGAGGGCOC GGCTGCAGGA 1560 
AG1GCCTTTT TATCAGTTGC AACAGGACTG TGACCTGAGC TGTCAGATCA CCATTCCCAA 1620 
AGATCGACAA AAGAG AAAGA AATCTTTAAG AAAGAAACTG GATTCACTAG GAAAGGAGAA 1680 
AAACAAAGAC AAAGAATTCA TGOCACAGGC ATTTGGAATG COCTTATCOC AAGTCATTGC 1740 
GAATGACAGG GOCTATAAAC TCAAGCAOGA CTTGCAGAGG GACGAGCAGA AAGATGCATC 1800 
TGACTTTCTQ GCTTCCCTOC TCOCATTTGG AAATAAAAGA CAAAACAAAG AACTCTCAAG 1860
CAGTAACTCA TCTCTCAGCT CAACCTCAQA AACACCGAAT GAGTCAACGT OCCCAAACAC 1920 
CGOGOAAGGG GCICCTCGGG CTAGGAGG AG GGGTGCCATO TCAGTGG ATT CTATCAOCGA 1980 
TCTTGATGAC AATCAOTCTC GACTACTAGA AGCTTTACAA CTTTOCTTGC CTGCTOAGGC 2040 
TCAAAGTAAA AAGG AAAAAG CCAGAG ATAA G AAACTCAGT CTGAATCCTA TTTACAGACA 2100 
GGTCCCTAGG CTGGTGGACA GCTGCTGTCA GCACCTAGAA AAACATGQCC TCCAQACAGT 2160 
GGGGATATTC CGAGTTGGAA GCTCAAAAAA GAGAGTGAGA CAATTACGTG AGGAATTTGA 2220 
OCGTGGGATT GATOTCTCTC TGGAGGAGGA GCACAGTGTT CATGATGTGG CAGCCT TGCT 2280 
GAAAGAGTTC CTGAGGGACA TGOCAGACCC CCTTCTCACC AGGGAGCTGT ACACAGCTTT 2340 
CATCAACACT CTCTTGTTGG AGCCGGAGGA ACAGCTGGGC ACCTTGCAGC TCCTCATATA 2400 
CCTTCTACCT CCCTGCAACT GCGACACCCT CCAOCGCCIG CTACAGTTCC TCTCCATCGT 2460 
GGCCAGGCAT GCCGATGACA ACATCAGCAA AGATGGGCAA GAGGTCACTO GGAATAAAAT 2520
GACATCTCTA AACTTAGOCA CCATATTTGG ACCCAACCTG CTGCACAAGC AGAAGTCATC 2580 
AGACAAAGAATTCTCAGTTC AGAGTTCAGC OCGGGCTOAG GAGAGCAOGG CCATCATCGC 2640 
TGTTGTGCAA AAGATGATTG AAAATTATGA AGCCCTGTTC ATGGTTCCCC CAGATCTCCA 2700 
GAAGOAAGTG CTGATCAGCC TGTTAGAOAC OGAT0CTGAT GTCGTGGACT A TTTAC TCAG 2760 
AAGAAAGGCT TCOCAATCAT CAAGCCCTG A CATGCIGCAG TOGGAAGTTT CCTTTTOCGT 2820 
GGGAGGGAGG CATTCATCTA CAG ACTCCAA CAAGGCCTCC AGCGGAGACA TCTCGCCTTA 2880 
TGACAACAAC TCCCCAOTGC TGTCTGAGCG CTCCCTGCTG GCTATGCAAG AGGACGCGGC 2940 
CCCGGGGGGC TCGGAGAAGC TTTACAGAGT GCCAGGGCAG TTTATGCTGG TGGGCCACTT 3000
GTCGTCGTCA AAGTCAAGGG AAAGTTCTCC TGGACCAAGO CTTGGGAAAG ATCTGTCAG A 3060 
GGAGCCTTTC GATATCTGGG GAACTFGGCA TTCAACATTA AAAAGCGGAT CCAAAGAOCC 3120 
AGGAATGACA GGTTCCTCTG GAGACATTTT TGAAAGCAGC TOCCTAAGAG CGGGGCCCTG 3180 
CTCCCTTTCT CAAGGGAAOC TGTCCCCAAA TTGGCCTCGG TGGCAGGGGA GCCCCGCAGA 3240 
GCTGGACAGC GACACGCAGG GGGCTCGGAG GACTCAGGCC GCAGCCOOCG CGACGGAGGG 3300 
CAGGGCOCAC CCTGCGGTOT CGCGCGCCTG CAGCACGCCC CACGTCCAGG TGGCAGGGAA 3360 
AGCCGAGCGG CCCACGGCCA GGTCGGAGCA GTACTTGACC CTGAGCGGCG CCCACG ACCT 3420 
CAGOGAGAGT GAGCTGGATG TGGCCGGGCT GCAGAGCCGG GGCACACCTC AGTGCCAAAG 3480
ACCCCATGGG AGTGGGAGGG ATGACAAGCG GCCCCCGCCT CCATACCCGG GCCCAGGGAA 3540 
GOCOGCGGCA GCGGCAGCCT GGATCCAGGG GCCCCCGGAA GGCGTGGAGA CAOCCACGGA 3600 
CCAGGGAGGC CAAGCAGCCG AGCGAGAGCA GCAGGTCACG CAGAAAAAAC TG AGCAGCGC 3660 
CAACTCCCTG CCAGCGGGCG AGCAGGACAG TCCGCGCCTG GGGGACGCTG GCTGGCTCG A 3720 
CTGGCAGAGA GAGCGCTGGC AGATCTGGGA GCTCCTGTCG ACCGACAACC CCGATGCCCT 3780 
GCCCGAGACG CTCGTCTG_AG CCCGCACCCA GCCGAGCCCC CCCTGCCCCG AGCCCCCCGC 3840 
CCICCAGCCC AGGGGGGACC GTGGGTGGTG GCCACTGGCA CACTTAGTGT TCTTCTTTCA 3900 
CACTTCTCAA AAGTGACACA AGAG AAATCC AGTTCACCTA CAG AGGTAG A GCACTCACGC 3960 
CCCCGCCATT GAGAATAAGG TTOCATTGCG TAGCCAGOCT TAGGAAAAAC AAACAGAACC 4020 
CAAACCAGAT GGCAATGTCC AATCTAAAAA CGTCCCTCTT GGCTCTATAA TATAAGATAC 4080 
AACTCTTGCT TGGTATAGCC TAACOGTATT TATGTGTCTT CGGTTTTGAC TATTGTGTAT 4140 
TCTGTAACAG ATT ATGTATA ATCATATATG ATATATTCAC AAAGAGAAAA CAAAAGGAAC 4200 
TTTTAAAAAA AAAATCACTT CACTTATATT AAGCAATGAG ATATACTAAA CAATGAGATT 4260 
CTATAGAATG TTCTAGAATG TGCACAAGCG GGTTTCTGTG CTTTTGCCAT AGCTTTATAA 4320 
CTGGGGATAA OOCTTCCTTC GATACCAAAC ACTAACAAGA GGAAGCAGAA TATGAGAAGC 4380 
CATATTTTTA CATAGGAGTC AGATACAAAA AGAAAAATCA CTGAATGCTT TTAGATATTG 4440 
AATACGTTTT CAGGAAAATG CTAAATCTGA TAG ATTAOGA AATATATTTT TAGAACTTGT 4500 
TTAOAAAGOA TTCAGTTAAC CAAACAAGAA AAAGGCAGTG CCTCACAAAG AAATTAAGAA 4560 
GTTGTCCGTC CCACGTTACA TCAAATTCAG TTTTATATAG GCCATATATA ATATATATTT 4620 
ATAATGTATA ATTTTTATGT ATTTTTCAAA ACTACAAACT GGAATCCAAC TATAAAGTGT 4680 
TTAAG AATCT ACACAGAATA TTCAAATTAT AG AACATGTT TTTTCCCTTT GCCCCATAAT 4740 
CAGTATTTGC CAAATTACAT GCAATTCCTT AAAAACTAAA TCACATTGGT AAAAGG CCTA 4800 
CAGCTTTGTA CTTACATTGT GCCAAAGCCT OAQOAAATGT TTTCTTTCGA ATTTTTATGT 4860 
GTATTGTAAA ATGTTCTACC GTACTTTAOT AGTTTGAAGT TTTCAAGTGC ATAACTATTT 4920 
TTGACCAOCA OAAOOCOATA CGCTTCAGTA TTTTATGCAA TTTTTTTTCA CTTCG AAGGO 4980 
AAAGTGTATT ATAAAAAAAG ATTTTTTTTT 1 1 1 AAAACAT GCTACTCTTA ATTTTCATGT 5040 
TGGTGATGAA ATTGCCAGTG GTGTTTCTTA AGGTTCTATC TTGTGCCATG ATGAATAAAA 5100 
AGTTAAGCAA AAAAAAAAAA AAAAAAAAAA AAA 



SEQ ID NO:146 PFG8 Protein sequence:
Protein Accession #: NP_038286.1

1 11 21 31 41 51
I I I I I I 

MSAQSIXHSV FSCSSPASSS AASAKGFSKR KLRQTR5LOP AUGGCGSDE AGAEGS ARGA 60 
TAGRLYSPSL PAESLGHILA SSSRGPPPRA TRLPPPGPLC SSFSTPSTPQ EKSPSGSFHF 120 
DYEVPLGRGG LKKSMAWDU* SVLAGPASSR SASSHjCSSG GGPNGIFASP RRWLQQRKFQ 180 
SPPDSRGHPY WWKSEGDFT WNSMSGRSVR LRS VPIQSLS ELERARLQEV PFYQLQQDCD 240 
LSCQmPKD GQKRKKSLRK KLDSLGKEKN KDKEFIPQAF GMPLSQVIAN DRAYKLKQDL 300 
QRDEQKDASD FVASLLPFGN KRQNKELSSS NSSLSSTSET PNESTSPNTP EPAPRARRRG 360 
AMSVDSrTDL DDNQSRLLEA LQLSLPAEAQ SKKEXARDKJC LSLNPIYRQV FRLVDSOCQR 420
LEKHOLQTVO IFRVGSSKKR VRQLREEFDR GIDVSLEEBH S VHDVAALLK EFLRDMPDPL 480 
LTRELYTAH NTLIXEPEEQ LQTLQLUYl. LPPCNCDTLH RUXJFLSIVA RHADDNISKD 540 
GQEVTGNKMT SLNLATIFGP NLLHKQKSSD KEFS VQSS AR AEESTAQA.V VQKMD2NYEA 600 
LfMVPPDLQN EVUSLLETD PDWDYLLRR KASQSSSPDM LQSEVSFSVG GRHSSTDSNK 660
ASSGDISPYD NNSPVLSERS LLAMQEDAAP GGSEKLYRVP GQFMLVGHLS SSKSRESSFO 720 
PRLGKDLSEB PFDIWOTWHS TLKSGSKDPG MTGSSGDIFB SSSLRAGPCS LSQGNLSPNW 780 
PRWQGSPAEL DSDTQGARRT QAAAPATEGR AHFAVSRACS TPHVQVAGXA ERPTARSEQY 840 
LTLSGAHDLS ESELDVAGLQ SRATPQCQRP HGSGRDDKRP PPPYPGPGKP AAAAAW1QGP 900 
PEGVETPTDQ GGQAAEREQQ VTQKKLSSAN SLPAOEQDSP RLGDAGWLDW QRERWQIWEL 960
LSTDNPD ALP ETLV 



SEQ ID NO:147 PFG4 DNA SEQUENCE

Nucleic Acid Accession #: NM_002202

Coding sequence: 240-1289 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51
I I I I I I 

CCCCCGAGCC GCGCCGAGTC TGCCOCOGCC GCAOCCKXTC CGCTCOGCCA ACTCCGCCGG 60 
CTTAAATTGG ACTCCTAOAT CCOCOAGGCC GOGGCGCAGC CG AOCAGCGG CTCTTTCAOC 120 
ATTGGCAAGC CCAGGGGCCA ATATTTOOCA CTTAGCCACA GCTCCAGCAT CCTCTCTGTG 180 

GGCTGTTCAC CAACTGTACA ACCAOCATTT CACtGTGGAC ATTACTCOCT CITACAGATA 240
TGGGAGACAT GGGAGATCCA CCAAAAAAAA AACGTCTOAT TTCCCTATGT GTTGGTTGCG 300
GCAATCAG AT TCAOGATCAG TATATTCTOA GGGTTTCTCC GGATTTGGAA TGGCATGCGG 360 
CATGTTTGAA ATGTGCGGAG TGTAATCAGT ATTTGGACGA GAGCTGTACA TGCTTTGTTA 420 
GGGATGGGAA AACCTACTGT AAAAGAGATT ATATCAGGTT GTACGGG ATC AAATGCGCCA 480 

AGTGCAGCAT CGGCITCAGC AAGAACGACT TCGTGATGCG TGCCCGCTCC AAGGTGTATC 540
ACATOGAGTG TTTCCGCTGT GTGGCCTGCA GCCGCCAGCT CATCCCTGGG GACGAATTTG 600 
CGCTICGGG A GGACGGTCTC TTCTGCOQAG CAGAOCACGA TGTGGTGGAG AGGGCCAGTC 660 
TAGGCGCTGG CGACCCGCTC AGTCCCCTGC ATCCAGCGCG GCCACTGCAA ATGGCAGCGG 720 
AGCCCATCTC CGGCAGGCAG CCAGCCCTGC GGCCCCACGT CCACAAGCAG CCGGAGAAGA 780 

CCACCCGCGT GOGGACTGTG CTGAACGAGA AGCAGCTGCA CACCTTGCGG ACCTGCTACG 840
CCGCAAACCC GOGGOCAGAT GCGCTCATG A AGG AGCAACT GGTAGAGATG ACGGGCCTCA 900 
GTCCCCGTGT GATCCGGGTC TGGTTTCAAA ACAAGCGGTG CAAGGACAAG AAGCGAAGCA 960 
TCATG ATG AA GCAACTOCAG CAGCAGCAGC CCAATGACAA AACTAATATC CAGGGGATG A 1020 
CAGGAACTCC CATGGTGGCT GCCAGTGCAG AGAG ACACGA OGGTGGCTTA CAGGCTAACC 1080 

CAGTGGAAGT ACAAAGTTAC CAGGCACC1T GGAAAGTACT GAGCGACTTC GGCTTGCAGA 1140
GTGACATAGA TCAGCCTGCT TTTCAGCAAC TGGTCAATTT TTCAGAAGGA GGACCGGGCT 1200 
CTAATTCCAC TGGCAGTGAA GTAGCATCAA TGTOCTCTCA ACTTCCAGAT ACACCTAACA 1260 
GCATGGTAGC CAGTOCTATT GAGGCATGAG GAACATTCAT TCTGTATTTT TTTTCGCTGT 1320 
TGGAGAAAGT GGG AAATTAT AATGTCGAAC TCTGAAACAA AAGTATTTAA CGACCCAGTC 1380 

AATGAAAACT GAATCAAGAA ATGAATGCTC CATGAAATGC ACGAAGTCTG TTTTAATGAC 1440
AAGGTGATAT GGTAGCAACA CTGTGAAGAC AATCATGGGA TnTACTAGA ATTAAACAAC 1500 
AAACAAAACG CAAAAOCCAG TATATGCTAT TCAATGATCT TAGAAGTACT GAAAAAAAAA 1560 
GACGTTTTTA AAACGTAQAG GATTTATATT CAAGGATCTC AAAGAAAGCA TTTTCATTTC 1620 
ACTGCACATC TAGAGAAAAA CAAAAATAGA AAATTTTCTA GTCCATCCTA ATCTGAATGG 1680 

TGCTGTTTCT ATATTGGTCA TTGCCTTGCC AAACAGGAGC TCCAGCAAAA GCGCAGGAAG 1740
AGAGACTGGC CTCCITGGCT GAAAGAGTCC TTTCAGGAAG GTGGAGCTGC ATTGGTTTGA 1800 
TATGTTTAAA GTTGACTTTA ACAAGGGGTT AATTG AAATC CTGGGTCTCT TGGOCTGTCC 1860 
TGTAGCTGGT TT A1 1 1 11 lACnTGCCCCCTCCCCACl 1 1 11 1 1U AGATC CATOCTTTAT 1920 
CAAGAAGTCT GAAGCX3ACTA TAAAGGTTTT TGAATTCAGA TTTAAAAACC AACTTATAAA 1980 

GCATTGCAAC AAGGTTACCT CTATTTTGCC ACAAGOGTCT GGGGATTGTG TTTGACTTGT 2040
GTCTGTCCAA GAACTTTTCC CCCAAAGATG TGTATAGTTA TTGGTTAAAA TGACTGTTTT 2100 
CTCTCTCTAT GG AAATAAAA AGGAAAAAAA AAAGGAAACT TTnTTGTTI GCTCTTGCAT 2160 
TGCAAAAATT ATAAAGTAAT TTATTATTTA TTGTCGG AAG ACTTGCCACT TTTCATGTCA 2220 
TTTGACA1 1 1 1 1 1 GT1 1GCT G AAGTGAAAA AAAAAGATAA AGGTTGTACO GTGGTCTTTG 2280 

AATTATATGT CTAATTCTAT GTGTTTTGTC TTTTTCTTAA ATATTATGTG AAATCAAAGC 2340
GCCATATGTA GAATTATATC TTCAGGACTA TTTCACTAAT AAACATTTGG CATAGAT 



SEQ ID NO:148 PFG4 Protein sequence:
Protein Accession #: NP_002193.1

1 11 21 31 41 51 
I I I I I I

MGDPPKKKRL ISLCVGCGNQ IHDQYTLRVS PDLEWHAACL KCAECNQYLD ESCTCFVRDG 60
KTYCKRDYIR LYGIKCAKCS 1GFSKNDFVM RARSKVYHIE CTRCVACSRQ UPGDEFALR 120 
EDGLFCRADH DWERASIGA GDPLSPLHPA RPLQMAAEPI S ARQPALRPH VHKQPEKTTR 180 
VRTVLKEKQL HTLRTCYAAN PRPDALMKJEQ LVEMTGLSPR VKVWFQNKR CKDKKRSIMM 240 
KQLQQQQPND KTNIQGMTGT PMVAASPERH DGGLQANPVE VQSYQPPWKV LSDFALQSDI 300 

DQPAFQQLVN FSEGGPGSNS TGSHVASMSS QLPDTPNSMV ASPIEA

SEQ ID NO:149 PFG2 DNA SEQUENCE

Nucleic Acid Accession #: NM_001172

Coding sequence: 39-1103 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I 

GCGGAGCTCT GOCTTGG AG A TTCTCAGTGC TGCGG ATCAT GTOCCTAAGG GGCA-GCCTCT 60 
CGCGTCTCCT CCAGACGCGA GTGCATTCCA TCCTGAAGAA ATCOQTCCAC TCCGTCGCTO 120 
TGATAGGAGC OCOGTTCTCA CAAGGGCAGA AAAGAAAAGG AGTGGAGCAT GGTCCCGCTG 180 
CCATAAGAGA AGCTGGCTTO ATGAAAAGGC TCTCCAGTTT GGGCTGCCAC CTAAAAGACT 240 
TTGGAGATTT GAGmTACT CCAGTCOOCA AAGATGATCT CTACAACAAC CIGATAGTGA 300 
ATCCACGCTC AGTGGGTCTT GCCAAOCAGG AACTGGCTGA GGTGGTTAGC AGAGCTGTGT 360 
CAGATGGCTA CAGCTGTGTC ACACTGGGAG GAGAOCACAG CCTGGCAATC GGTAGCATTA 420 
GTGGCCATGC CCGACACTGC OCAGA OC 111 GTGTTGTCTG GGTTQATGOC CATGCTGACA 480
TCAACACACC CCTTACCACT TCATCAGGAA ATCTCCATOG ACAGCCAGTT TCATTTCTCC 540 
TCAGAG AACT ACAGGATAAG GTACCACAAC TCCCAGGATT TTCCTGG ATC AAACCTTGTA 600 
TCICITC TO C AAOTATTGTG TATATTGOTC TOAOAGAOGT GOACCCIPCT GAACATTTTA 660 
TTTTAAAOAA CTATGATATC CAGTATTTtT CCATGAO AGA TATTGATCGA CTTGGTATCC 720 
AQAAGGTCAT GGAAOGAACA TTTGATCTGC TOATTGGCAA GAGACAAAGA CCAATOCATT 780 
TGA GTI 1 IO A TATTGATGCA TTTGACCCTA eACTOGCTOC AGCCACAGGA ACTOCTGTTO 840 
TCGGGGGACT AACCTATCQA GAAGGCATGT ATATTCCTGA GGAAATACAC AATACAGGGT 900 
TGCTATCAGC ACTGGATCTT GTTGAAGTCA ATCCTCAGTT GGOCACCTCA GAGGAAOAGG 960 
OGAAGACTAC AGCTAACCTG GCAGTAGATO TOATTCCTTC AAGCTTTGGT CAGACAAG AG 1020 
AAGGAGGGCA TATTGTCTAT GACCAACTTC CTACTCCCAG TTCACCAGAT G AATCAGAAA 1080 
ATCAAGCACG TOTGAGAATTIAGGAGACAC TGTGCACTGA CATQTTTCAC AACAGGCATT 1140 
CCAGAATTAT GAGGCATTGA GGGGATAGAT GAATACTAAA TGGTTGTCTG GGTCAATACT 1200 
GCCITAATOA GAACATITAC ACATTCTCAC AATTGTAAAG TTTCCCCTCT ATTTTGGTGA 1260 
CCAATACTAC TGTAAATGTA TTTGGTnTT TGCAGTTCAC AGGGTATTAA TATGCTACAG 1320 
TACTATGTAA ATTTAAAGAA GTCATAAACA GCATTTATTA CCTTGGTATA TCATACTGGT 1380 

cn u riGcro nGnccnc acatttaagt gg ititi cat c i i ilxj iu oc tcctcccaca 1440 

GCCTGGCTAT ACAGTGCATC CTTGAACTGT CAGCCCACAG CAGCAATATG CTTATTCTAT 1500 
CCACATCCCT AACATCATGC ATTCACAAGG TCAAAGTTCT GGTCCACAAA CCCTTCCCTA 1560 
TAGAAGTTCA ATGGCTGCGA AAGAATTTGT AGTAAACCAG GCCTCCCAGG ATGGCG AGCT 1620 
OCAGTAAGAT G ATAATGGAA AGCAGCAGCT TGTTGGTTCT CACTCTACAA AGAGAAGCAA 1680 
AGTGGGGAGT AGTCAGAAGT TTGGATAACC TTCCI IC TAA ACATTTGGGG GTTAGACCTC 1740 
GGACCACGGC TGGATACICT GAGGCTGTAT GTTTGATCAC ACAGCCACTT AGCAGGAAGT 1800 
ACTCATAAGG TTCTTTAGCT GTCACTTAGG GATAACACTG TCTACCTCAC AGAAATGTTA 1860 
AACTGAGACA ATAAAACCCA AAGCAT 
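
The coding-sequence ranges quoted in these records (for example 39-1103 for PFG2) appear to be 1-based, inclusive nucleotide positions that include the stop codon, which is consistent with the 354-residue protein listed for PFG2. A minimal Python sketch of that bookkeeping follows; the function names `cds_slice` and `cds_summary` are illustrative assumptions and are not part of the original listing.

```python
def cds_slice(sequence: str, start: int, end: int) -> str:
    """Return the coding region for 1-based, inclusive coordinates.

    Whitespace (the 10-base column breaks used in the listing) is removed
    before slicing. Position counters at the end of each row must be
    stripped by the caller; this sketch does not parse them.
    """
    cleaned = "".join(sequence.split()).upper()
    if not (1 <= start <= end <= len(cleaned)):
        raise ValueError("coordinates fall outside the sequence")
    # Convert the 1-based inclusive range to Python's 0-based half-open slice.
    return cleaned[start - 1:end]


def cds_summary(start: int, end: int) -> dict:
    """Summarise a coding range, assuming the range includes the stop codon."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("coding range is not a whole number of codons")
    codons = length // 3
    return {"nucleotides": length, "codons": codons, "residues": codons - 1}


if __name__ == "__main__":
    # PFG2 record: coding sequence 39-1103.
    print(cds_summary(39, 1103))
    # Toy sequence with a start codon at position 3 and a stop ending at 11.
    print(cds_slice("AAATGGCC TAA GG", 3, 11))
```

A range whose length is not a multiple of three raises an error, which can help flag a start or end coordinate that was misread during text extraction.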



SEQ ID NO:150 PFG2 Protein sequence:
Protein Accession #: NP_001163.1

1 11 21 31 41 51 
I I I I I I 

MSLRGSLSRL LQTRVHSHJC KSVHSVAV1G APFSQGQKRK GVEHGPAAJR EAGLMKRLSS 60 
LGCHLKDPGD LSFTPVPKDD LYNNUVNPR S VGLANQELA EWSRAVSDG Y5CVTLGGDH 120 
SLAKTnSGH ARHCPDLC W VATJAHADINT PLTTSSGNLH GQPVSFIXRE LQDKVPQLPG 180 
FSWKPCISS ASIVYIGLRD VDPPEHFUJC NYDIQYFSMR DIDRLOIQKV MERTFDLXiG 240 
KRQRPIHLSF DIDAFDPT1A PATGTPWGG LTYREGMYIA EHHNTGLLS ALDLVEVNPQ 300 
LATSEEEAKT TANLAVDVIA SSFGQTREGG WVYDQLPTP SSPDESENQA RVRI 



SEQ ID NO:151 PFG1 DNA SEQUENCE

Nucleic Acid Accession #: NM_017908

Coding sequence: 80-1255 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

AATTATATAT TTTTACTCTA TU1 1 1C TCTA CAT Ci 1111 11 TCTTTCCGTT GCTGGOGGAA 60 
GAGGCACGTG CXiCTGCTGA A TG OAGCTGGT CGCTGGTTGC TACGAGCAGG TCCTCTTTGG 120 
GTTCGCTGTA CACCCGGAGC CCAAGGCTTG CGGCGACCAC GAGCAATGGA CTCTTGTGGC 180 
TGACrrCACT CACCATGCTC ACACTGCCTC CTTGTCAGCA GTAGCTQTAA ATAGTCGTTT 240 
TGTGGTCACT GGGAGCAAAG ATGAAACAAT TCACATTTAT GACATGAAAA AGAAGATTG A 300 
GCATGGGGCT CTAGTGCATC ACAGTGGTAC AATAACTTGC CTG AAATTCT ATCGCAACAG 360 
GCATTTAATC AGTGGAGCGG AAGATGGACT CATCTGTATC TGGGATGCAA AGAAATGGGA 420 
ATGCCTGAAG TCAATTAAAG CTCACAAAGG ACAGGTGAOC TTOCnTCTA TTCACCCATC 480 
TGGCAAGTTG GCCCTGTCGG TTGGTACAGA TAAAACTTTA AGA ACGTGG A ATCTTGTAG A 540 
AGGAAGATCA GCA1TCATAA AAAATATAAA ACAAAATGCT CACATAGTAG AATGGTCOCC 600 
AAGAGGAGAG CAGTATGTAG TTATCATACA GAATAAAATA GACATCTATC AGCTTGACAC 660 
TGCATOCATT AGTGGCAOCA TCACAAATGA AAAGAGAATT TCCTCTGTTA AA TTTCTTTC 720 
AGAGTCTGTC CTTGCAGTGG CTGGAGATGA AG AAGTTATA AGGTTTTTTG ACTGTGATTC 780 
ACTAGTGTGC CTCTCCGAAT TTAAAGCTCA TGAAAACAGG GTAAAGGACA TGTTCAGTTT 840 
TGAAATTCCA OAGCATCATG TTATTGTTTC AGCATCGAGT GATGGTTTCA TCAAAATGTG 900 
OAAGCITAAG CAGGATAAGA AAGTTCCCOC ATCTTTACTC TGTGAAATAA ACACTAATGC 960 
CAGGCTGACG TGTCTTGG AG TGTGGCTAGA CAAAGTGGCA GACATGAAAA GCCTTCCTCC 1020 
AGCTGCAGAG CCTTCTCCTG TAAGTAAAGA ACAGTCCAAA ATTGGCAAAA AGGAGCCTGG 1080 
TGACACAGTG CACAAAGAAG AAAAGCGGTC AAAAOCTAAC ACAAAGAAAC GCGGTTTAAC 1140
AGGTGACAGT AAGAAAGCAA CAAAAGAAAG TGGCCTGATA TCAACCAAGA AGAGGAAAAT 1200
GOTAGAAATO TTGGAAAAGA AGAGGAAAAA GAAGAAAATA AAAACAATGC AGIQAATCAC 1260 
AO AlOTCT CC TOAAAOAACT CTTTTAOATO AAATCATTCT ACTCAAATGT ACCTTAATTT 1320 
1 1 1 1 1 1 I IC C CTGAGTAAAA GCAAOAAATTTCI rCCTTTO OAAAAAATAT ATATATTAAA 1380 
AAACCACTTT TAG ATGGTTT TTTTTAAAAA AAAAAAAAAA ACTGGTAAAA TTA L'l 1 1 IG G 1440 
CAGACAGTGT TTTATGAATT ATGTATCATC TTGATATATA ATATGTTAAT GTGTCATGTA 1500 
ATTTTTACTr TOTACAAAGC AAATAAAO AT CTTTCTCAAA AAAAAAAAAA AAAA 



SEQ ID NO:152 PFG1 Protein sequence:
Protein Accession #: NP.06037&1

1 11 21 31 41 51
| | | | | |

MELVAGCYEQ VLFGFAVHPE PKACGDHEQW TLVADFTHHA HTASLSAVAV NSRFVVTGSK 60 
DBnHIYDMK KKIEHGALVH HSGTTTCLKF YGNRHLISGA EDGLIOWDA KKWECLKSK 120 
AHKGQVTFLS IHPSGKLALS VGTDKTLRTW NLVEGRSAH KNDCQNAHIV EWSPRGEQYV 180 
VnQNKIDIYQLDTASlSGT 1TOEKR1SSV KFLSESVLAV AGDEEVIRFF DCDSLVCLCB 240
FKAHENRVKD MESFEIPEHH VIVSASSDGF IKMWKLKQDK KVPPSLLCH NTNARLTdjG 300

VWLDKVADMK SLPPAAEPSP VSKEQSKIGK KEPG0TVHKE EKRSKPNTKK RGLTGDSKKA 360 
TKESGUSTK KRKMVEMLEK KRKKKKHOM Q 



SEQ ID NO:153 PFD6 DNA SEQUENCE

Nucleic Acid Accession #: NMJ14S68

Coding sequence: 110-2953 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I 

GATGTCTTGG ACATGCTCTO GCTGGCTAAT CTCCATGTTC TAGCCGACTG AAAATACGGT 60 
GGGCAAGTGG ATGGTGTGCT TATTTGCAGT CTAAAQAAAT TTCCTTTT G A TGT GGCAOAA 120 
AATCGAGGAT GTGGAGTGGA GACCCCAGAC TTACTTGGAG CTGGAGGGTC TGCCTTGCAT 180

CCTGATCTTC AGTGGG ATGG ACCCGCATGG GGAGTCCTTG CCGAGGTCTT TGAGGTACTG 240
TGACCTGCGA TTG ATAAACT CCTCCTGCTT GGTG AGAACA GCCTTGGAGC AGG AGCTGGG 300 
CCTGGCTGCC TACTTTGTG A GCAACG AGGT TCCCTTGGAG AAGGGGGCTA GGAACG AGGC 360 
CTTGGAGAGT GATGCTGAGA AGCTG AGCAG CACAGACAAC GAGGATGAGG AGCTGGGG AC 420 
AGAAGGCTCT ACCTCGGAGA AG AGAAGCCC CATG AAAAGG GAGAGGTCCC GCTCCCACGA 480 

CICAGCATCC TCATCCCTCT CCTCCAAGGC TTCCGGTTCA GCGCTCGGTG GCG AGTGCTC 540
GGCTCAGCCC ACAGCACTCC CCCAGGGAGA GCATGCCAGG TCGCCCCAGC CCCGTGGCCC 600 
CGCAGAGGAG GGCAGAGCCC CTGGTG AGAA ACAGAGGCCC CGGGCAAGTC AGGGGCCACC 660 
CTCGGCCATC AGCAGGCACA GTCCCGGGCC GACGCCCCAG CCCGACTGTA GCCTCAGGAC 720 
CGGCCAGAGG AGCGTCCAGG TGTCGGTCAC CTCGTCGTGC TOCCAGCTGT CCTCCTCCTC 780 

GGGCTCATCC TCCTCATCCG TGGCGCCCGC TGCCGGCACG TGGGTCCTGC AGGCCTCCCA 840
GTGCTCCTTG ACCAAGGCCT GCCGCCAGCC ACCCATTGTC TTCTTGCCCA AGCTCGTGTA 900 
CGACATGGTT GTGTCCACTG ACAGCAGTGG CCTGCCCAAG GCCGCCTCCC TCCTGCCCTC 960 
CCCCTCGGTC ATGTGGGGCA GCTCTTTCCG CCCCCTGCTC AGCAAGACCA TGACATCCAC 1020 

CGAGCAGTCC CTCTACTACC GGCAGTGG AC GGTGOCCCGG CCCAGCCACA TGG ACTACGG 1080
CAACCGGGCC GAGGGCCGCG TGGACGGCTT CCACCCCCGC AGGCTGCTGC TCAGCGGCCC 1140
CCCTCAGATC GGGAAGACAG GTGCCTACCT GCAGTTCCTC AGTGTCCTGT CCAGGATGCT 1200 
TGTTCGGCTC ACAGAAGTGG ATGTCTATGA CGAGGAGGAG ATCAATATCA ACCTCAGAGA 1260 
AGAATCTGAC TGGCATTATC TCCAGCTTAG CGACCCCIGG CCAGACCTGG AGCTGTTCAA 1320 
GAAGTTGCCC TTTG ACTACA TCATTCACGA CCCGAAGTAT GAAG ATGCCA GCCTGATTTG 1380 

TTCGCACTAT CAGGOTATAA AGAGTGAAGA CAGAGGGATG TCCCGGAAGC CGGAGGACCT 1440
TTATGTGCGG CGTCAGACGG CACGGATGAG ACTGTCCAAG TACGCAGCGT ACAACACTTA 1500 
CCACCACTGT GAGCAGTGCC ACCAGTACAT GGGCTTCCAC CCCCGCTACC AGCTGTATGA 1560 
GTCCACCCTG CACGCCTTTG CCTTCTCTTA CTCCATGCTA GGAGAGGAGA TCCAGCTGCA 1620 
CTTCATCATC CCCAAGTCCA AGGAGCACCA CTTTGTCTTC AGCCAAOCTQ GAGGCCAGCT 1680 

GGAGAGCATG CGACTACCCC TCGTGACAGA CAAGAGCCAT GAATATATAA AAAGTCCGAC 1740
ATTCACTCCA ACCACCGGCC GTCACGAACA TGGGCTCTTT AATCTGTACC ACGCAATGGA 1800 
CCGTGCCAGC CATTTGCACG TGCTGGTTGT CAAGG AATAC GAG ATGGCAA TTTATAAGAA 1860 
ATATTGGCCC AACCACATCA TGCTGGTGCT CCCCAGTATC TTCAACAGTG CTGG AGTTGG 1920 
TGCTGCTCAT TTCCTCATCA AGGAGCTGTC CTACCATAAC CTGGAGCTCG AGCGGAACCG 1980 

GCAGGAGGAG CTGGGAATCA AGCCGCAGGA CATCTGGCCT TTCATTGTGA TCTCTGATGA 2040
CTCCTGCGTG ATGTGGAACG TGGTGGATGT CAACTCTGCT GGGGAGAGAA GCAGGGAGTT 2100 
CTCCTGGTCG GAAAGGAACG TGTCTTTGAA GCACATCATG CAGCACATCG AGGCGGCCCC 2160 
CGACATCATG CACTAOGCCC TGCTGGGCCT GCGGAAGTGG TCCAGCAAGA CCCGGGCCAG 2220 
CGAGGTGCAA GAGCCCTTCT CCCGCTGCCA CGTGCACAAC TTCATCATCC TGAACGTGGA 2280 

CCTGACCCAG AACGTGCAGT ACA ACCAG AA CCGGTTCCTG TGTGAOGATG TAG ACTTCAA 2340
CCTGCGGGTG CACAGCGCCG GCCTCCTGCT CTGCCGGTTC AACCGCTTCA GCGTG ATGAA 2400 
GAAGCAGATC GTGGTGGGCG GCCACAGGTC CTTCCACATC ACATCCAAGG TGTCTGATAA 2460 
CTCTGCCGCG GTCGTGCCGG CCCAGTACAT CTGTGCCCCG GACAGCAAGC ACACGTTCCT 2520 
CGCAGCGCCC GCCCAGCTCC TGCTGGAGAA GTTCCTGCAG CACCACAGCC ACCTCTTCTT 2580 

CCCGCTGTCC CTGAAGAACC ATGACCACCC AGTGCTGTCT GTCG ACTGTT ACCTGAACCT 2640
GGGATCTCAO ATTTCTCfl 1 1 GCTATGTGAG CTCCAGGCCC CACTCTTTAA ACATCAGCTG 2700 
CTCGGACTTG CTGTTCAGTG GGCTGCTGCT GTACCTCTGT GACTCTTTTG TGGG AGCTAG 2760 
CTTTTTGAAA AAGTTICATT TTCTGAAAGG TGCGACGTTG TGTGTCATCT GTCAGGACCG 2820 
GAGCTCACTG CGCCAGACGG TCGTCCGCCT GGAGCTCGAG GACGAGTGGC AGTTCCGGCT 2880
GCGCGATGAG TTCCAGACCG CCAATGCCAG GGAAGACCGG OCGCTCTTTT TTCTGACGGG 2940
ACGACACATC TGAGGAAGAC AGCGGCGAGT TTTCTQAAGA GATOAGTGCT CAGAGCCCTC 3000 
ATGCTGTTOA GGCTAAAGGG AGGCCTGGAA OGGTGGGGCG TTTOACTGGA ATGGACOOCA 3060 
GGGACTGTCC AGGTGCAGCC CCTCCTAGTA CACATGGGCC CCOGAGGCCG TGGTCCTGGG 3120 
AGCCAGG AAG ACTCCGCAGT GGGTG AG AAT GAAAACTTGA G ACTCCCAAG TTCTGGGCCA 3180 
GOCCATTGCT CTGGGCTGTT TTAAAGCCCA TTTCACGAGG AACAAAGATT TACTTCCTGT 3240 
CCTGCCATTC GTGTGCTTCC ATGGACAAAC CTO ATTTI 1 1 TCTCTTAGTT CTAAAGAATC 3300 
TTGGGTTATT TTGTAGCGGT GCCAGTATTT CAGTAGATGG GATTTCAGCC AAGTAGGTTC 3360 
CCCTGTAAOC TCCTACAAAG CAATATTCCA AAGG AACATT TTAACIGTAA A GGCT GGAGA 3420 
CAAGAAAAAA TAAGTAGATC GTTTTAATAA CAATTATTTA ATTGCCTATA AGTTTGCTGT 3480 
TTCAGAGGCT AGCCCAAAGG CATCAAATTT AATAAAOTTA AACAAATTGA TTTACTTCAG 3540 
AGCAAATATG ATOCTATTAA AATAATATAG GGTAAATACC CTACCTCTTA GAAAGGGCAA 3600 
AAATGCAAAG AAGCTTTCTT TAAAACTAAA AGGG1 11111 GGGGGGGGAG TTGGCGGGGA 3660 
GGAAATAAGG CTAACAGAGG TTGACCTAAA ATTAGCCTTA CAAAGGAGAA AGGACCACAT 3720 
TGCTTACTTO AAACAGACAA TGAAAACAAC CAAAGTGATA TATAAAATAG TTGATGAGAA 3780 
CTAGACTTAT GACTGTAGTT TACTAGAGTT TAGTTTTCAG TTGCTGAAGT AGCTCATTTT 3840 
CTCTTACTAA TGTTTGGTTC CTCAGGGAAG AATCTCACTT GACTAGAGAG GAGGTGGGAA 3900 
CAGAAGAGAQ AAGG AGGCAG GG AGATGTAT TTCTTAGGGC TCACCCCTTC ACAGACTGAC 3960 
AGAATGG1 1 1 rGTTITGTrT TGTTTTGTTT TGTTTTOTn 1 1GAGATGGA CTCTAGCTCT 4020 
GTCACCCAGG CTGGAGTGCA GTGGTGCGAT CTCGGCTCAC TGCAAGCTCC GCCTCCCGGO 4080 
TTCTCACCAT TCTCCTGCCT CAGCCTCCCG AGTAGCTGGG ACTACAGGCG CCCACCACCA 4140 
CGCCCGGCTA AT T TTTTGTA TTTTTTAGTA GAGACGGGGT TTCACCATGT TAGCCAGG AT 4200 
GGTCTCGATC TCCTGACCTC GTGATCCGCC CGCCTCGGCC TCCCAAAGTG CTGGGATTAC 4260 
AGGCGTGAGC CACCGTGCCT GCCCCAGAAT GGTTTTTAAA GCCACAGTTG AGAGGCCACC 4320 
CATTGOOCGG CGCCTGGACA GTGATCATCT TGTTCATCTT GTTCAGTCCT TTCTTGTGTG 4380 
ATTGGAATTA TTCATCCOCT TTGAAAGATG AGAAGGTTGA GATGCAAAGA GTCTACCTTT 4440 
CCAAGTTCTC ACTGCTCGAA AGAGCTAGAA GCACAGTTCA AAGTTCTGGC TTCTGGACTC 4500 
TGCAGTCCAG GTCTCCCTTC TOCCACTTGC CTACCCTCAA TGCCACACTG TTTTTGAAGT 4560 
GGCOCATAAC TTG AAGG AAA AGTTTAAAGA CAGTrCAATT TAATCATCAG AATGCATTCT 4620 
1U1T1T11C GGAGACGGAG TTTCACTCTT GCTGCCCAGG CTGGAGTGCA ATGGTGCAAT 4680 
GATCTCGGCT CACTGCAACC TCTGCCTCCT GGOTTCAAGT GATTCTCCAG CCTCAGCCTC 4740 
CCGAGTAGCT GGGATTATGG GCGCCCACCA CCATGOCCAG CTAATTTTTG TATTTTTTTT 4800 
TTTTAGTAGA GATGGGGTTT CGCCAGGTTG GCCAGGCTGG TCTTGTGAAC TCCTGGCCTC 4860 
AGGTG ATCTG CCCACCTCAT CCTCCAAAAG TGCTGGGATT ACAGGCATGA GCCACTGCGC 4920 
CTGGCCTCAG AATGCATTCT TACACATCTA TCCTAGACAT TTATAAGCAC TCTAATGQAT 4980 
AACAATCCAA GAATAAATGA TTGTAAAAGA TGATGCCGAA GAGTTGATGT CAATCTTTTT 5040 
TTCCTAAGAA AAAAAGTCCG CGAGTATTAA ATATTTAGAT CAATGTTTAT AAAATGATTA 5100 
CTTTGTATAT CTCATTATTC CTATTTTGGA ATAAAAACTG ACCTTCTTTA ATCATATACT 5160 
TGTCTTTTGT AAATAGCAGC TTTTGTGTCA TTCTCCCCAC TTTATTAGTT AATTTAAATT 5220 
GGAAAAAACC CTCAAACTAA TATTCTTGTC TGTTCCAGTC TTATAAATAA AACTTATAAT 5280 
GCATG 



SEQ ID NO:154 PFD6 Protein sequence:
Protein Accession #: NPJB5483.1

1 11 21 31 41 51 
I I I I I I 

MWQKEDVEW RPQTYLELEG LPCHJFSGM DPHGESLPRS LRYCDLRUN SSCLVRTALE 60 
QELGLAAYFV SNEVPLEKGA RNEALESDAB KLSSTDNEDE ELGTEGSTSE KRSFMKRERS 120 
RSHDSASSSL SSKASGSALG GESSAQPTALPQGEHARSPQ PRGPAEEGRA PGEKQRPRAS 180 
QGFPSAISRH SPGPTPQPDC SLRTGQRSVQ VSVTSSCSQL SSSSGSSSSS VAPAAGTWVL 240 
QASQCSLTKA CRQPPIVFLP KLVYDMWST DSSGLPKAAS LLPSPSVMWA SSFRPLLSKT 300 
MTSTEQSLYY RQWTVPRPSH MDYGNRAEGR VDGFHPRRLL LSGPFQIGKT G A YUJFLS VL 360 
SRMLVRLTEV DVYDEEHNI NLREESDWHY LQLSDPWPDL ELFKKLFFDY IHDPKYEDA 420 
SUCSHYQGI KSEDRGMSRK PEDLYVRRQT ARMRLSKYAA YNTYHHCEQC HQYMGFHPRY 480 
QLYESTLHAF AFSYSMLGEE IQLHFUPKS KEHHFVFSQP GGQLESMR1P LVTDKSHEYI 540 
KSPTFIPTTG RHEHGLFNLY HAMDGASHLH VLWKEYEMA IYKKYWPNHI MLVLPSIFNS 600 
AGVGAAHFU KELSYHNIEL ERNKQEELGI KPQDIWPFIV ISDDSCVMWN WDVNSAGER 660 
SREFSWSERN VSLKHIMQHI EAAPDIMHYA LLGLRKWSSK TRASEVQEPF SRCHVHNFn 720 
LNVDLTQNVQ YNQNRFLCDD VDFNLRVHSA GLLLCRFNRF SVMKKQIWG GHRSFHTTSK 780 
VSDNSAAWP AQYICAPDSK HTFLAAPAQL LLEKFLQHHS BLFFFLSLKN HDHPVLSVDC 840 
YLNLGSQIS V CYVSSRPHSL NISCSDLLFS GUXYIjCDSF VGASFLKKFH FUCGATLCVI 900 
CQDRSSLRQT WRLELEDEW QFRLROEFQT ANAREDRPLF FLTGRM 



SEQ ID NO:155 PFC6 DNA SEQUENCE

Nucleic Acid Accession #: NM_000522

Coding sequence: 1-1167 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51

ATGACAGCCT CCGTGCTCCT CCACCCCCGC TGG ATCGAGC CCACCGTCAT GTTTCTCTAC 60
GACAACGGCG GCGGCCTGGT GGCCGACGAG CTCAACA AGA ACATGGAAGG GGCGGCGGCG 120 
GCTGCAGCAG CGGCTGCAGC GGCGGCGGCT GCCGGGGCCG GGGGCGGGGG CTTCCCCCAC 180 
CCGGCGGCTG CGGCGGCAGG GGGCAACTTC TCGGTGGCGG CCGCGGCCGC GGCTGCGGCG 240 
GCCGCCGCGG CCAACCAGTO CCGCAACCTG ATGGCGCACC CGGCGCCCTT GGCGCCAGGA 300 
GCCGCGTCCG CCTACAGCAG CGCCCCCGGG GAGGCGCCCC CGTCGGCTGC CGCCGCTGCT 360
GCCGCGGCTG CCGCTGCAGC CGCCGCCGCC GCCGCOGCGT CGTCCTOGGG AGGTOCCGGC 420
CCGGCCGGCC OGGCGGCGGC AGAGGOGGCC AAGCAATGCA GCCCCTGCTC GGCAGCGGCG 480 
CAGAGCTDGT CGGGGCCOOC GGCGCTGCCC TATGGCTACT TCGGCAGCOG CTACTACCCG 540 
TGCGCOCGCA TGGGCCCGCC CCCCAACGCC ATCAAGTCGT GCCCCCAGCC CCCCTCGGCC 600 
GOOGCCGCCG CCGCCTTCGC GGACAAGTAC ATGGATACCO CCGGCCCAGC TGCOOAGGAG 660 
TTCAGCICCC GCGCTAAGG A GTTCGCGTTC TACCACCAGG GCTAOGCAGC OGGGOCTTAC 720 
CACCAOCATC AGCCCATGCC TGGCTACCTO OATATGOCAO TGOTCOCGGG CCTCGGGGGC 780 
CCOGGCGAGT OGCCCCACGA ACCCTTGGC3T CTTCCCATGO AAAGCTACCA GCCCTGGGCG 840 
CTGCCCAACG GCTCGAACCG CCAAATGTAC TOCCCCAAAO AGCAOGCGCA GCCTCCCCAC 900
CTCTGGAAGT CCACTCTGCC CGACGTGGTC TCCCATCCCT CGGATGCCAG CTCCTATAGG 960 
AGGGGGAGAA AGAAGCGCOT GCCTTATACC AAGGTGCAAT TAAAAGAACT TQAACGGGAA 1020 
TACGCCACGA ATAAATTCAT TACTAAGGAC AAACGGAGGC GGATATCAGC CAOGACGAAT 1080 
CTCTCTQAGC GGCAGGTCAC AATCTGGTTC CAGAACAGQA GGGTTAAAGA GAAAAAAGTC 1140 
ATCAACAAAC TGAAAACCAC TAGTTAA 



SEQ ID NO:156 PFC6 Protein sequence:
Protein Accession #: NP_000513.1

1 11 21 31 41 51
I I I I I I 

MTASVLLHPR WEPTVMFLY DNGGGLVADE LNKNMEGAAA AAAAAAAAAA AGAGGGGFPH 60 
PAAAAAGGNP SVAAAAAAAA AAAANQCRNL MAHPAPLAPG AASAYSSAPG EAPPSAAAAA 120 
AAAAAAAAAA AAASSSGGPG PAGPAAAEAA KQCSPCSAAA QSSSGPAALP YGYPGSGYYP 180 
CARMGPPPNA KSCFQPPSA AAAAAFADKY MDTAGPAAEE FSSRAKEFAF YHQGYAAGPY 240 
HHHQPMPGYL DMPWPGLGQ PGESRHEPLG LPMESYQPWA LPNGWNGQMY CHCEQAQPPH 300 
LWKSTtPDW SHPSDASSYR RGRKKRVPYT KVQLKELERE YATNKFITKD KRRRISATTN 360 
LSERQVTIWF QNRRVKEKKV INKLKTTS 



SEQ ID NO:157 PFA3 DNA SEQUENCE

Nucleic Acid Accession #: AW102723

Coding sequence: 523-2676 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

CCCTTATGGC GATTGGGCGG CTGCAGAGAC CAGGACTCAG TTCCCCTGCC CTAGTCTGAG 60 
GCTAGTGGGT GGGACTCAGC TCAGAGTCAG TTTTCAG AAG CAGGTTICAG TTGCAG AGTT 120 
TTCCTACACT TTTCCTGCGC TAGAGCAGCG AGCAGCCTGG AACAGACCCA GGCGGAGGAC 180 
ACCTGTGGGG G AGGGAGCGC CTGGAGGAGC TTAG AGACCC CAGCCGGGCG TGATCTCACC 240 
ATGTGCGGAT TTGCGAGGCG CGCCCTGGAG CTGCTAGAGA TCCGG AAGCA CAGCCCCGAG 300 
GTGTGOQAAG CCACCAAG AC TGCGGCTCTT GG AGAAAGCG TG AGCAGGGG GCCACCGCGG 360 
TCTCCGGCCT GTCTGCACCC TGTCGCCTGA GCTGCCTGAC AGTGACAATG ACATCCCAGT 420 
TACCAGTGTC CTTGAATTG A TAGTGGCTTC TGTTTCTCAG TCTGATATAA GAACTACAGC 480 
TCATCAGGAG GAGATCGCAG CAGGGTAAGA GACACCAACA CCATGTTCTG CACGAAGCTC 540 
AAGGATCTCA AGATCACAGG AGAGTGTCCT TTCTCCTTAC TGGCACCAGG TCAAGTTCCT 600 
AACGAGTCTT CAGAGGAGGC AGCAGGAAGC TCAGAGAGCT GCAAAGCAAC CGTGCCCATC 660 
TGTCAAGACA TTCCTGAGAA GAACATACAA G AAAGTCTTC CTCAAAGAAA AACCAGTCGG 720 
AGCCGAGTCT ATCTTCACAC TTTGGCAGAG AGTATTTGCA AACTGATTTT CCCAGAGTTT 780 
GAACGGCTGA ATGTTGCACT TCAG AG AACA TTGGCAAAGC ACAAAATAAA AG AAAGCAGG 840 
AAATCTTTGG AAAGAG AAGA CTTTQ AAAAA ACAATTGCAG AGCAAGCAGT GCAGCAGAGT 900 
CCAOTGGAGT TATCAAAGAA TCTCTTGGTG AAGAG Ullll' TAAAATATGT TACG AGGAAG 960 
ATO AAAACAT CCTTGGGGTG GTTGQ AGGCA CCCTTAAAG A TTTTTAAACA GCTTCAGTAC 1020 
CCTTCTGAAA CAGAGCAGCC ATTGCCAAGA AGCAGGAAAA AGGGGCAGCT TGAGGACGCC 1080 
TCCATTCTAT GCCTGG ATAA GGAGGATGAT TTTCTACATG TTTACTACTT CTTCCCTAAG 1140 
AGAACCACCT CCCTGATTCT TCCCGGCATC ATAAAGGCAG CTGCTCACGT ATTATATG AA 1200 
ACGG AAGTGG AAGTGTCGTT AATGCCTCCC TGCTTOCATA ATG ATTGCAG CGAGTTTGTG 1260 
AATCAGCCCT ACTTOI 1U 1 A CTCCGTTCAC ATGAAAAGCA CCAAGCCATC CCTGTCCCCC 1320 
AGCAAACCCC AGTCCTCGCT GGTGATTCCC ACATCGCTAT TCTGCAAG AC ATTICCATTC 1380 
CATTTCATGT TTGACAAAGA TATGACAATT CTGCAATTTG GCAATGGCAT CAG AAGGCTG 1440 
ATGAACAGGA G AGACTTTCA AGGAAAGCCT AATTTTGAAT ACTTTGAAAT TCTGACTCCA 1500 
AAAATCAACC AGAOCTTTAG CGGGATCATG ACTATGTTGA ATATGCAGTT TGTTGTACGA 1560 
GTGAGGAGAT GGG ACAACTC TGTGAAGAAA TCTTCAAGGG TTATGGACCT CAAAGGCCAA 1620 
AtGATCTACA TTGTTGAATC CAGTGCAATC TTGTTnTGG GGTCACCCTO TGTGOACAGA 1680 
TTAGAAGATT TTACAGGACG AGGGCTCTAC CTCTCAGACA TCCCAATTCA CAATGCACTG 1740 
AGGGATGTGG TCTTAATAGG GGAACAAGCC GGAGCTCAAG ATGGCCTG AA GAAGAGGCTG 1800 
GGGAAGCTG A AGGCTACCCT TG AGCAAGCC CACCAAGOCC TGGAGG AGGA GAAG AAAAAG 1860 
ACAGTAGACC TTCTOTGCTC CATATTTCCC TGTGAGGTTG CTCAGCAGCT GTGGCAAGGG 1920 
CAAGTTGTGC AAGCCAAGAA GTTCAGTAAT GTCACCATGC TCTTCTCAGA CATCGTTGGG 1980 
TTCACTGCCA TCTGCTCCCA GTGCTCACCG CTGCAGGTCA TCACCATGCT CAATGCACTG 2040 
TACACTCGCT TCGACGAGCA GTGTGGAGAG CTGGATGTCT ACAAGGTGGA GACCATTGCG 2100 
ATGCCTATTG TGTGGCTTGG GGGATTACAC AAAGAGAGTG ATACTCATGC TGTTCAOATA 2160 
GCGCTGATGG CCCTGAAGAT GATGGAGCTC TCTOATGAAO TTATGTCTCC CCATGGAGAA 2220 
CCTATCAAGA TGCGAATTGG ACTGCACTCT GGATCAGTTT TTGCTGGCGT CGTTGGAGTT 2280 
AAAATGCCCC GTTACTGTCT TTTTGGAAAC A ATGTCACTC TGGCTAACAA ATTTGAGTCC 2340 
TGCAGTGTAC CACGAAAAAT CAATGTCAGC CCAACAACTT ACAGATTACT CAAAGACTGT 2400 
CCTGGTTTCO TGTTTACCCC TCGATCAAGG GAGGAACTTC CACCAAACTT CCCTAGTGAA 2460 
ATCCCCGGAA TCTGCCATTT TCTGG ATGCT TACCAACAAG GAACAAACTC AAAACCATGC 2520
TTOCAAAAOA AAQ ATGTGOA AGATGCAAGC CAATTTTTTA GGCAAAGCAT CAGGAATAGA 2580
TTAGCAACCT ATATAOCTAT TTATAAGTCT TTGGGGTTTG ACTCATTO AA GATGTGTAGA 2640 
GCCTCTGAAA GCACTTTAGG GATTGTAGAT GGCTAACAAQ CAGTATTAAA ATTTCAGG AG 2700 
CCAAGTCACA AlClllCTCC TGTTTAACAT GACAAAATGT ACTCACTTCA GTACTTCAGC 2760 
TCTTCAAGAA AAAAAAAAA A ACCTTAAAAA GCTACTTTTG TGGGAGTATT TCTATTATAT 2820 
AAOCAGCACT TACTACCTGT ACTCAA AATT CAGCACCTTG TACATATATC AGATAATTGT 2880 
AGTCAATTGT ACAAACTG AT GGAGTCACCT GCAATCTCAT ATCCTGGTGG AATGCCATGG 2940 
TTATTAAAGT GTGTTTGTGA TAGTTOTCGT CAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 3000 
AAAA 

SEQ ID NO:158 PFA3 Protein sequence:
Protein Accession #: NP_000847.1




1 11 21 31 41 51 
I I I I I I 

MPCTKLKDLK ITGECPFSLL APGQVPNESS EEAAGSSESC KATVPIOQDI PEKNIQESLP 60 
QRKTSRSRVY LHTLAESICK UFPEFERLN VALQRTLAKH KIKESRKSLE REDFEKT1AB 120 
QAVQQSPVEL SKNLLVKRFL KYVTRKMKTS LGWLEAPLKI FKQLQYPSET EQPLPRSRKK 180
GQLEDASILC LDKEDDFLHV yyFFPKRTTS ULPGHKAA AHVLYETEVE VSLMPPCFHN 240 
DCSEFVNQPV LLYSVHMKST KPSLSPSKPQ SSLVIPTSLPCKTFPFHFMFDKDMTILQFG 300 
NGIRRLMNRR DPQGKPNFEY FEILTPKINQ TFSGMTMLN MQFWRVRRW DNSVKKSSRV 360 
MDUCGQMIY1 VESS AILFLG SPCVDRLEDF TGRGLYLSDI PIHNALRDW UGEQARAQD 420
GLKKRLGKLK ATLEQAHQAL EEEKKKTVDL LCS IFFCEVA QQLWQOQWQ AKKFSNVTML 480
FSDIVGFTAI CSQCSPLQVI TMLNALYTRF DQQCGELDVY KVET1AMPIV WLGGLHKESD 540 
THAVQIALMA LKMMELSDEV MSPHGEPIKM RIGLHSGSVF AG WGVKMPR YCLFGNNVTL 600 
ANKFESCSVP RKWVSPTTY RLLKDCPGFV FTPRSREELP PNFPSEIPGI CHFLDAYQQG 660 
TNSKPCFQKK DVEDASQFFR QSDXNRLATY fflYKSLGFD SLKMCRASES TLGIVDG 



SEQ ID NO:159 PFA1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_004362

Coding sequence: 102-1934 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I 

CGCCGGCGGG ACTOGTCTO A AGAOACGCGG GGACAAAGTO GCAACGACTT GGACATCTG A 60 
GCTGTCACTO CCGAAAACAG GCCGCAAGAG AGATAATCAA TAJSCATTIC CAAGCCnTT 120 

GGCTATGTTr GGCICTTCTG TTCATCTCAA TTAATGCAOA ATTTATGGAT GATGATGTTG 180
AGAOGGAAGA CTTTGAAGAA AATTCAGAAG AAATTGATGT TAATGAAAGT GAACTTTCCT 240 
CAGAGATTAA ATATAAGACA CCTCAACCTA TAGGAGAAGT ATATTTTGCA GAAACTTTTG 300 
ATAGTGG AAG GTTGGCTGGA TGGGTCTTAT CAAAAGCAAA G AAAG ATGAC ATGGATGAGG 360 
AAATTTCAAT ATACGATGGA AGATGGGAAA TTGAAGAGTT GAAAGAAAAC CAGGTACCTG 420 

GTGACAGAGG ACIGGTATTA AAATCTAG AG CAAAGCATCA TGCAATATCT GCTGTATTAG 480
CAAAACCATT CATTTTTGCT G ATAAACCCT TGATAGTTCA ATATGAAGTA AATTTTCAAG 540 
ATGGTATTG A TTGTGGAGGT GCATACA1TA AACTCCTAGC AGACACTG AT GATTTGATTC 600 
TGGAAAACTT TTATGATAAA ACATCCTATA TCATTATGTT TGGACCAGAT AAATGTGG AG 660 
AAGATTATAA ACTTCATTTT ATCTTCAGAC ATAAACATCC CAAAACTGGA GTnTOGAAG 720 

AGAAACATGC CAAACCTCCA GATGTAGACC TTAAAAAG 1 1 CI 1 1ACAGAC AGGAAGACTC 780
ATCTTTATAC CCTTGTG ATG AATCCAGATG ACACATTTG A GGTGTTAGTT GATCAAACAG 840 
TTGTAAACAA AGGAAGCCTC CTAGAGGATG TGGTTCCTCC TATCAAACCT CCCAAAGAAA 900 
TTGAAGATCC CAATGATAAA AAACCTGAGG AATGGGATGA AAGAGCAAAA ATTOCTGATC 960 
CTTCTGCCGT CAAACCAGAA GACTGGGATG AAAGTGAACC TGCCCAAATA GAAGATTCAA 1020 

GTGTTGTTAA ACCTGCTGGC TGGCTTGATG ATGAACCAAA ATTTATCCCT GATCCTAATG 1080

CTGAAAAACC TGATGACTGG AATGAAGACA OGGATGGAGA ATGGGAGGCA CCTCAGATTC 1140 
TTAATCCAGC ATGTCGG ATT GGGTGTGGTG AGTGGAAACC TCCCATGATA GATAACCCAA 1200 
AATACAAAGG AGTATGGAGA CCTCCACTGG TCGATAATCC TAACTATCAG GGAATCTGGA 1260 
GTCCTCG AAA AATTCCTAAT CCAGATTATT TCGAAGATGA TCATCCATTT CTTCTGACTT 1320 

CTTTCAGTGC TCTTGGTTTA GAGCTTTGGT CTATGACCTC TGATATCTAC TTTGATAATT 1380

TTATTATCTG TTCGGAAAAG GAAGTAGCAG ATCACTGGGC TGCAGATGGT TGGAGATGGA 1440 
AAATAATGAT AGCAAATGCT AATAAGCCTG GTGTATTAAA ACAGTTAATO GCAGCTGCTG 1500 
AAGGGCACCC ATGGCTTTGG TTGATTTATC TTGTGACAGC AGGAGTGOCA ATAGCATTAA 1560 
TTACTTCATT TTGTTGGCCA AGAAAAGTAA AGAAAAAACA TAAAGATACA GAGTATAAAA 1620 

AAACCGACAT ATGTATACCA CAAACAAAAG GAGTACTAGA GCAAGAAGAA AAGGAAGAGA 1680
AAGCAGCCCT GGAAAAACCA ATGGACCTGG AAGAGGAAAA AAAGCAAAAT GATGGTGAAA 1740 
TGCTTGAAAA AGAAGAGGAA AGTGAACCTG AGGAAAAGAG TGAAGAAGAA ATTGAAATCA 1800 
TAGAAGGGCA AGAAGAAAGT AATCAATCAA ATAAGTCTGG GTCAGAGGAT GAGATOAAAG 1860 
AAGCAGATGA GAGCACAGGA TCTGGAGATG GGCXX3ATAAA GTCAGTACGC AAAAGAAGAG 1920 

TACGAAAGGA CT^AACTAGA TTGAAATATT TTTAATTCCC GAGAGGATGT TTGGCATTGT 1980
AAAAATCAGC ATGCCAGACC TGAACTTTAA TCAGTCTGCA CATCCTGTTT CTAATATCTA 2040 
GCAACATTAT ATTCTTTCAG ACATTTATTT TAGTCCTTCA TTTCCG AGG A AAAAG AAGCA 2100 
ACTTTGAAGT TACCTCATCT TTG AATTTAG AATAAAAGTG GCACATTACA TATCGG ATCT 2160 
AAGAGATTAA TACCATTAGA AGTTACACAG TTTTAGTTGT TTGGAGATAO 1 1 1 rGGTTTG 2220 

TACAGAACAA AATAATATGT AGCAGCTTCA TTGCTATTGG AAAAATCAGT TATTGGAATT 2280
TCCACTTAAA TGGCTATACA ACAATATAAC TGGTAGTTCT ATAATAAAAA TGAGCATATG 2340 
1 TCTGTTGTG AAGAGCTAAA TGCAATAAAG TTTCTGTATG GTTGTTTGAT TCTATCAACA 2400 
ATTGAAAGTO TTGTATATGA CCCACATTTA CCTAGTTTGT GTCAAATTAT AGTTACAGTG 2460 
AGI rGTTTGC TTAAATTATA GATTCCTTTA AGGACATGCC TTGTTCATAA AATCACTGGA 2520
TTATATTGCA GCATATTTTA CATTTGAATA CAAGOATAAT GGGmTATC AAAACAAAAT 2580
OATGTACAGA TTTTTTTTCA AO I I I I 1ATA GTTGCTTTAT GOCAGAGTGG TTTAOCOCAT 2640 
TCACAAAATT TCTTATOCAT ACATTGCTAT TO AAAATAAA ATTTAAATAT TTTTTCATOC 2700 
TGAAAAAAAA

SEQ ID NO:160 PFA1 Protein sequence:
Protein Accession #: NP_004353.1

1 11 21 31 41 51
I I I I I I 

MHFQAFWLCL GLLFISINAE FMDDDVBTED FEENSEEIDV NESELSSHK YKTPQPIGEV 60 
YFAETFDSGR LAGWVLSKAK KDDMDEEISI YDGRWEEEL KENQVPGDRQ LVLKSRAKHH 120 
AISAVLAKPF IFADKPUVQ YEVNFQDGID CGGAYKLLA DTDDLUJENF YDKTSYIIMF 180 
GPDKOGEDYK LHFIFRHKHP KTCVEEEKHA KPPDVDLKKF FTDRKTHLYT LVMNPDDTFE 240
VLVDQTWNK QSLLEDWFP KPPKEJEDP NDKKPEEWDE RAKIPDPSAV KPEDWDESEP 300 
AQD3DSSWK PAGWIDDEPK F1FDFNAEKP DDWNEDTDGE WEAPQILNPA CRIGCQEWKP 360 
PMD3NPKYKQ VWRPPLVDNP NYQGIWSPRK IPNPDYFEDD HPFLLTSFSA LGLELWSMTS 420 
DIYFDNFUC SEKEVADHWA ADGWRWKIMI ANANKPGVLK QLMAAAEGHP WLWLIYLVTA 480
GVHALITSF CWPRKVKKKH KDTEYKKTDI OPQTKGVLE QEEKEEKAAL EKPMDLEEEK 540
KQNDGEMLEK EEESEPEEKS EEEIEUEGQ EESNQSNKSG SEDEMKEADE STGSGDGPHC 600 
SVRKRRVRKD 



SEQ ID NO:161 PEZ9 DNA SEQUENCE

Nucleic Acid Accession #: NM_005932

Coding sequence: 75-2216 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51

| | | | | |

GCGGAGGGCG CGCTCCCAGC GAAAGCAGCA GGGCAGGGAT CIGCGTTGGA GGAAGGGACT 60 
GCTCTGGTGC TAGAATGCTG TGGGTCGGAA GGCTGGGOGO CTTGGOAGCC AOAGCAGCAG 120 
CTCTGCCGCC CCGCCGGGCG GGCCGGGGAA GCCTCGAAGC CGGGATCCGG GOCCGAAGGG 180 
TCAGCACCAG CTGGTCTCCC GTGGGCGCXXj CCTICAATGT CAAGCCCCAG GGCAGCCGCT 240
TGG ACCIGTT CGGCGAGCGG GCGCGTCTTT TIGG AGTTCC TG AGCTGAGT GCCOCAQAAG 300
GATTTCATAT TGCACAAGAA AAAGOCTTGA GAAAGACAGA ATTGCTTGTG GACXX5TGCAT 360 
GTTGCACCCC ACCTGGGCCC CAGAOOGTGC TGATCTTCGA TGAGCTCTCG GATTCCTTAT 420 
GCAGAGTGGC CGACTTGGCT GATmGTGA AAATCGCTCA OCCTGAGCCA GCATTCAGAG 480 
AAGCTGCGGA AG AAGCTTGT AGAAGTATTG GCACCATGGT AGAGAAGTTG AACACAAATG 540 

TGGATTTATA TCAAAGTTTG CAAAAATTAC TAGCTGATAA AAAACTTGTG GATTCCCTTG 600
ATCCAGAAAC AAGGCGAGTG GCIGAACTGT TTATGTTTG A TTTTGAAATT AGTGGAATCC 660 
ATCTAGACAA ACAAAAGCGT AAAAGAGCAG TGGACCTCAA TGTTAAAATC TTGGATTTGA 720 
GTAGTACATT TCTTATGGGA ACCAATTTTC GCAACAAG AT TGAGAAGCAT CTCTTAGCAG 780 
AACACATTCG TCGTAACTTT ACATCTGCTG GGGATCATAT CATAATTGAT GGTCICCACG 840 

CAGAATCACC AGATGACTTG GTGCGAGAAG CIGCITATAA A A111 ITCTT TATCCCAATG 900
CTGGTCAATT GAAATGTTTA GAAGAATTGC TCAGCAGCAG AGATCTTCTG GCAAAGTTGG 960 
TGGGGTATTC CACGTTTTCT CACAGGGCTC TGCAAGGAAC GATAGCTAAA AATCCAGAGA 1020 
CTGTCATGCA GTTCCTTGAA AAACTATCTG ACAAACTTTC TGAAAGAACT CTGAAAGATT 1080 

TTGAGATGAT ACGAGGGATG AAAATGAAAC TGAATGCTCA AAATTCCGAA GTAATGCCCT 1140
GGGACCCCCC TTACTACAGT GGTGTGATTC GTGCAGAAAG GTATAATATT GAGCCCAGCC 1200
TATATTGCCCU11 1 ITCTCT CTTGGAGCAT GCATGGAAGG CCTGAATATT TTGCTTAACA 1260 
GACIGTTGGG GATTTCATTA TATGCAG AGC AGOCTGCAAA AGGAGAGGTG TGG AGCGAAG 1320 
ATGTCCG AAA ACTGGCTGTT GTTCATG AAT CTGAAGG ATT GTTGGGGTAC ATTTACTGTG 1380 
ATTTTTTTCA GCG AGCAG AC AAAOCACATC AGGATTGCCA TTTCACTATC CGTGG AGGCA 1440 

GACTAAAGGA AGATGGAGAC TATCAACTCC CACITGTAGT TCTTATGCTG AATCTTCOGC 1500
GTTCCTCAAG GAGTTCTCCA ACTTTGCTAA CTOCTGGCAT GATGGAAAAT CTTTTCCATG 1560 
AAATGGGACA TGCCATGCAT TCAATGCTAG GAOGTACTCG TTACCAACAC GTCACTGGG A 1620 
CCAGGTGCCC TACTGATTTT GCTGAGGTTC CTTCTATTCT GATGGAGTAC TTTGCAAATG 1680 

ATTATCGAGT AGTTAACCAA TTTGCCAGAC ATTATCAGAC TGGACAGCCA CTGCCAAAAA 1740
ATATG GTGTC TCGTCTTTGT GAATCTAAAA AGGTTT GT GC TGCAGCTGAT ATGCAACTTC 1800
AGGTCTTTTA TGCCACTCTG G ATCAAATCT ACCATGGGAA GCATCCCCTG AGGAATTCAA 1860 
CCACAG ACAT TCTCAAGGAA ACACAAGAGA AATTCTATGG CCTACCATAT GTTCCAAATA 1920 
CTGCCTGGCA GCTGCGATTC AGCCACCTCX! TGGGGTATGG TGCTAGATAT TACTCTTACC 1980 
TCATGTCCAG AGCGGTCGCC TOCATGOTTT GGAAGGAGTG TTTTCTACAG GATCCTTTCA 2040 

ACAGGGCTGC CGGGGAGCGC TATOGCAGGG AGATGCTGGC CCACGGTGGA GGCAGGGAGC 2100
CCATGCTCAT GGTTGAAGGT ATGCTTCAGA AGTGTCCTTC TGTTGATGAC TTCGTAAGTG 2160 
CCCTCGTTTC CGACTTGGAT CTGGACTTCXJ AAACTTTCCT CATGGATTCT GAATAAAAGA 2220 
AACACTCTAC AGCTCTAATC AAGGTCATGT AGTAATGACT TTGTTATAAA TGCTACAGCT 2280 
GTGAGACCTT GTTTCTGATT GTTTCATTGT TOGCTTCTOT AATTCTGAAA AACTTTAAAC 2340 

TGGTAGAACT TGG AATAAAT AATTTGTTTT AATTAAAAAA AAAAAAAAAA AA






SEQ ID NO:162 PEZ9 Protein sequence:
Protein Accession #: NP_005923.1

1 11 21 31 41 51 
I I I I I I 

MLCVGRUGGL GARAAALPPR RAGRGSLEAO IRARRVSTSW SPVGAAFNVK FQGSRLDLFG 60 
ERARLFGVPE LS APEGFMA QEKALRKTEL LVDRACSTPP GPOTVUFDE LSDSLCRVAD 120
LADFVKIAHP EPAFREAAEE ACRSIGTMVE KLNTNVDLYQ SLQKLLADKK LVDSLDPETR 180
RVAELFMFDF EISGIHLDKQ KRKRAVDLNV HLDLSSTFL MGTNFPNKIE KHLLPEHIRR 240 
NFTSAGDHD IDGLHAESPD DLVREAAYKI FLYPNAGQLK CLEELLSSRD LLAKLVGYST 300 
FSHRALQGTt AKNPETVMQF LEKLSDKLSE RTUCDFEMK GMKMKLNAQN SEVMPWDPPY 360 
YSGVIRAERY NIEPSLYCPF FSLG ACMEGL NILLNRLLGI SLY AEQPAKO EVWSEDVRKL 420 
AWHESEGLL GYIYCDFFQR ADKPHQDCHFTIRGGRLKED GDYQLPLWL MLNLPHSSRS 480 
SPTLLTPGMM ENLFHEMGHA MHSMLGRTRY QHVTGTRCPT DFABVPSILM HYFA NDYRW 540 
NQFARHYQTG QPLPKNMVSR LCESKKVCAA ADMQLQVFYA TLDQIYHOKH PLRNSTTDIL 600 
KETQEKFYGL PYVPNTAWQL RFSHLVGYGA RYYSYLMSRA VASMVWKECF LQDPFNRAAG 660 
ERYRREMLAH GGGREPMLMV EGMLQKCPSV DDFVSALVSO LDLDFETFLM DSB 



SEQ ID NO:163 PEZB DNA SEQUENCE

Nucleic Acid Accession #: AF103307

Coding sequence: none (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

ACAGAAGAAA TAGCAAGTGC CGAGAAGCTG GCATCAGAAA AACAOAGGGG AGATTTGTGT 60 
GGCTGCAGCC GAGGGAGACC AGGAAGATCT GCATGGTOGG AAGGACCIGA TGATACAGAG 120 
GAATTACAAC ACATATACITAGTGT1 1 CAA TO AACACCAA OATAAATAAG TGAAGAGCTA 180 
GTCCGCTOTO AGTCTCCTCA GTG ACACAGO GCTGGATCAC CATCGACGGC ACTTTCTG AG 240 
TACTCAGTGC AGCAAAGAAA GACTACAGAC ATCTCAATGG CAGGGGTGAG AAATAAGAAA 300 
GOCTGCTGAC TTTACCATCT GAGGOCACAC ATCTGCTOAA ATG GAGATA A TTAACATCAC 360 
TAGAAACAGC AAGATGACAA TATAATGTCT AAGTAGTGAC ATGTTTTTGC ACATTTOCAG 420 
CCCCTTTAAA TATCCACACA CACAGGAAGC ACAAAAGGAA GCACAGAGAT GCCTGGGAGA 480 
AATGCCCGGC CGCCATCTTG GGTCATCGAT GAGCCTCGOC CTGTGCCTGG TCCCGCTTGT 540 
GAGGGAAGGA CATTAGAAAA TGAATTGATG TGTTCCTTAA AGGATGGGCA GGAAAACAGA 600 
TCCTGTTGTG GATATTTATT TOAACGGG AT TACAG ATTTG AAATG AAGTC ACAAAGTGAG 660 
CATTACCAAT GAGAGGAAAA CAGACGAGAA AATCTTGATG GCTTCACAAG ACATGCAACA 720 
AACAAAATGG AATACTGTGA TGACATGAGG CAGCCAAGCT GGGGAGGAGA TAACCACGGG 780 
GCAGAGGGTC AGGATTCTGG CCCTGCTGCC TAAACTGTGC GTTCATAACC AAATCATTTC 840 
ATATTTCTAA CCCTCAAAAC AAAGCTGTTG T AATATC TGA TCTCTACGGT TCCTTCTGGO 900 
CCCAACATTC TCCATATATC CAGCCACACT CATTTTTAAT ATTTAGTTCC CAGATCTGTA 960 
CTGTGACCTT TCTACACTGT AGAATAACAT TACTCATTTT GTTCAAAGAC CCTTCGTGTT 1020 
GCTGCCTAAT ATGTAGCTGA CTGTTTTTCC TAAGGAGTGT TCTGGCOCAG GGGATCTGTG 1080 
AACAGGCIGG GAAGCATCTC AAGATCTTTC CAGGGTTATA C TTACT AGCA CACAGCATGA 1140 
TCATTACGGA GTGAATTATC TAATCAACAT CATCCTCAGT GTCTTTGCCC ATACTGAAAT 1200 
TCATTTCOCA CTTTTGTGCC CATTCTCAAG ACCTCAAAAT GTCATTCCAT TAATATCACA 1260 
GGATTAACTT TTTTTTTTAA CCTGGAAGAA TTCAATGTTA CATGCAGCTA TGGGAATTTA 1320 
ATTACATATT TTGTTTTCCA GTGCAAAGAT GACTAAGTCC TTTATCCCTC CCCTTTGTIT 1380 
OA TTTT 1 1 1 1 CCAGTATAAA GTTAAAATGC TTAOCCTTGT ACTGAGGCTG TATACAGCAC 1440 
AGCCTCTCCC CATCCCTCCA GCCTTATCTG TCATCACCAT CAACCCCTCC CATACCACCT 1500 
AAACAAAATC TAACTTGTAA TTCCTTGAAC ATGTCAGGAC ATACATTATT CCTTCTGCCT 1560 
GAGAAGCTCT TCCTTGTCTC TTAAATCTAG AATGATGTAA AGTTTTG AAT AAGTTGACTA 1620 
TCTTACTTCA TGCAAAGAAG GGACACATAT GAGATTCATC ATCACATGAG ACAGCAAATA 1680 
CTAAAAGTGT AATTTGATTA TAAGAGTTTA GATAAATATA TGAAATGCAA GAGCCACAGA 1740 
GGG AATGTTT ATGGGGCACG TTTGTAAGCC TGGGATGTGA AGCAAAGGCA GGGAACCTCA 1800 
TAGTATCTTA TATAATATAC TTCATTTCTC TATCTCTATC ACAATATCCA ACAAGCTTTT 1860 
CACAGAATTC ATGCAGTGCA AATCCCCAAA GGTAACCTTT ATCCATTTCA TGGTGAGTGC 1920 
GCTTTAGAAT TTTGGCAAAT CATACTGGTC ACTTATCTCA ACTTTGAGAT GTGTTTGTCC 1980 
TTGTAGTTAA TTGAAAGAAA TAGGGCACTC TTGTGAGCCA CTTTAGGGTT CACTCCTGGC 2040 
AATAAAG AAT TTACAAAGAG CTACTCAGGA CCAGTTGTTA AGAGCTCTGT GTGTGTGTGT 2100 
GTG1G 1GTGT GAGTGTACAT GCCAAAGTGT GCCTCTCTCT CTTGACCCAT TATTTCAGAC 2160 
TTAAAACAAG CATGTTTTCA AATGGCACTA TGAGCTGCCA ATOATGTATC ACCACCATAT 2220 
CTCATTATTC TCCAGTAAAT GTG ATAATAA TGTCATCTGT TAACATAAAA AAAGTTTGAC 2280 
TTCACAAAAG CAGCTGG AAA TGG ACAACCA CAATATGCAT AAATCTAACT CCTACCATCA 2340 
GCTACACACT GCTTGACATA TATTGTTAG A AGCACCTCGC ATTTGTGGGT TCTCTTAAGC 2400 
AAAATACTTG CATTAGGTCT CAGCTGGGGC TGTGCATCAG GCGGTTTGAG AAATATTCAA 2460 
TTCTCAGCAG AAGCCAGAAT TTGAATTCCC TCATCTTTTA GGAATCATTT ACCAGGTTTG 2520 
GAGAGGATTC AGACAGCTCA GGTGCTTTCA CTAATGTCTC TGAACTTCTG TOCCTCTTTG 2580 
TGTTCATGGA TAGTOCAATA AATAATGTTA TCTTTGAACT OATGCTCATA GGAGAGAATA 2640 
TAAGAACTCT GAGTGATATC AACATTAGGG ATTCAAAGAA ATATTAGATT TAAGCTCACA 2700 
CTGGTCAAAA GGAACCAAGA TACAAAGAAC TCTGAGCTGT CATCGTOCCC ATCTCTGTGA 2760 
GOCACAACCA ACAGCAGG AC OCAACGCATG TCTGAGATCC TTAAATCAAG GAAACCAGTG 2820 
TCATGAGTTO AATTCTCCTA TTATGGATGC TAGCTTCTGG CCATCTCTGG CTCTCCTCTT 2880 
GACACATATT AGCTTCTAGC CTTTGCTTCC ACGACTTTTA TCTTTTCTCC AACACATCGC 2940 
TTACCAATCC TCTCTCTGCT CTGTTGCTTT GGACTTCCCC ACAAGAATTT CAACGACTCT 3000 
CAAGTCTTTT CTTCCATCCC CAOCACTAAC CTGAATGCCT AGACCCTTAT TTTTATTAAT 3060 
TTCCAATAGA TGCTGCCTAT GGGCTATATT GC TTTAO ATO AACATTAG AT ATTTAAAGCT 3 1 20 
CAAGAGGTTC AAAATCCAAC TCATTATCTT CT(J 11 rCTTT CACCTCCCTG CTCCTCTCCC 3180 
TATATTACTG ATTGCACTGA ACAGCATGGT CCCCAATGTA GCCATGCAAA TGAGAAACCC 3240 
AGTGGCTCCT TGTGGTACAT GCATGCAAG A CTGCTOAAGC CAGAAGG ATG ACTG ATTACG 3300 
CCTCATGGGT GGAGGGGACC ACTCCTGGGC CTTCGTGATT GTCAGGAGCA AGACCTGAGA 3360 
TGCTCCCTGC CTTCAGTGTC CTCTGCATCT CCCCT TTCT A ATGAAGATCC ATAGAATTTG 3420 
CTACATTTG A G AATTCCAAT TAGOAACTC A CATGTTTTAT CTGCCCTATC AATTTTTTAA 3480 
AC I r GCT GAA AATTAAGTTT TTTCAAAATC TU 1CC1 1GTA AATTACTTTT TCTTACAGTG 3540 
TCTTGGCATA CTATATCAAC TTTGATTCTT TGTTACAACT TTTCTTACTC TTTTATCACC 3600 
AAAGTGGCTTTTA T TCTCTT TATTATTATT Al ITTCm 1 ACTACTATAT TACGTTOTTA 3660
TTA1 1 1 1U1T CTCTATAOTA TCAATTTATT TOATTTAGIT TCAATTTATT TTTATTGCIO 3720 
ACTTTTAAAA TAAGTGATTC GGGGOGTGGG AOAACAGGOG AGOG AGAGCA TTAGG ACAAA 3780 
TACCTAATGC ATCTOGOACT TAAAAOCTAO ATGATGGGTT GATAGGTGCA GCAAACCACT 3840 
ATGGCACACG TATACCTGTG TAACAAACCT ACACATTCTG CACATOTATC CCAGAACGTA 3900
AAGTAAAATT TAAAAAAAAQ TGA 

Protein Accession #: none

SEQ ID NO:164 PEZ6 DNA SEQUENCE

Nucleic Acid Accession #: AB0Z8S4S

Coding sequence: 1-3765 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

AJgATGATGA ACQTCCCCGG CGOAGGAGCO GCOGCGGTOA TGATGACGGG CTACAATAAT 60 
GGTCGCTGTC CCCGGA A TTC TCTCTACAGT GACTOCATTA TTGAGGAGAA GACGGTGGTC 120 

CTGCAGAAAA AAOACAATOA GGGCfTTOOA T T OGTCCT tC GAGGGGOCAA AGCTGACACA 180
CCCATTGAAG AATTCACACC AACACCGGCT TTCCCAGCCC TACAGTACCT GGAGTCCGTO 240 
GATGAAGGTG GGGTGGCGTG GCAAGGCGGA CTAAGGACCG GGGACTTCTT GATTGAGGTt 300 
AACAATG AGA ATGTTGTCAA AGTCGGCCAC AGGCAGGTGG TGAACATGAT OCGGCAGGGA 360 
GGGAATCAOC TGGTCCTTAA GGTGGTCACG GTGACCAGGA ATCTGGACCC CGAOGACACC 420 

GCCAGGAAGA AAGCTOOOOC GCCTOCAAAG CGGGCACCX3A CCACAGCCCT CAOCCTGCGC 480
TCCAAGTCCA TOACCICGGA GCTCGAGGAO CTCGTGGATA AAG ATAAACC CGAGGAGATA 540 
GTCCCGGCCT CCAAGCCCTC CCGCGCTGCT GAGAACATGG CTGTGGAACC GAOGOTGGOG 600 
ACCATCAAGC AGOGOCCCAG CAGOCGGTGC TTCOCGGCGG GCTCAGACAT GAACTCTGTG 660 
TACGAACGCC AAGGAATOGC CGTGATGACG CCCACTGTTC CTGGGAGCCC AAAAGCCCCG 720

TTTCTGGGCA TOCCICGAGG TACGATGCGA AGGCAGAAAT CAATAGACAG CAGAATCTTT 780
CTATCAGGAA TAACAGAGGA AGAGCGGCAG TTTCTGGCTC CTOCAATGCT □AAOTTCACC 840 
AGAAGCCTGT CCATGCCGOA CACCTCTGAG GACATCCCCC CTOCACCGCA GTCTGTGCCC 900 
CCGTCCCCAC CACCACCTTC CCCAACCACT TACAACTGCC CCAAGTCCCC AACTCCAAGA 960 
GTCTACGGG A CGATTAAGCC TGCGTTCAAT CAG AATTCTG CCGCCAAGGT GTCCCCCGCC 1020 

ACCAGGTCCG ACACCGTGGC CACCATGATG AGGGAGAAGG GGATOTACTT CAGGAGAGAG 1080
CTGGACOGCT ACTCCTTGGA CTCTGAAGAC CTCTACAGTC GGAATGCCGG CCCGCAAGCC 1140 
AACTTCCGCA ACAAG AGAGG CCAGATGCCA GAAAACCCAT ACTCAGAGGT GGGGAAGATC 1200 
GCCAGCAAAG CCGTCTACGT OCCCGCCAAG CCCGCCAGGC GGAAGGGGAT GCTGGTGAAG 1260 
CAGTOCAACG TGQAGG ACAG CCCCG AGAAG ACGTGCTCCA TCOCTATCOC GACCATCATC 1320 

GTGAAGGAGC CGTCCACCAG CAGCAGCGGC AAGAGCAGOC AGGGCAGCAG CATGGAGATC 1380
GACCCCCAGG COCCGGAGOC ACCGAGCCAG CTGCGGCCTG ACGAAAGCCT GACCGTCAGC 1440 
AGCCCCnTG CCGCCGCCAT CGCCGGAGCC GTCCGCGACC GTGAGAAGCG GCTGGAAGCC 1500 
AGGAGGAACT CCCCGGCCTT OCTCTCCACA GACCTCGGGG ATGAGGATGT GGGCCTGGGG 1560 
CCACCCGCCC CCAGGACGCG GCCCTCCATG TTCOCCGAGG AGGGGGATTT TGCTGACGAG 1620 

GACAGOGCTG AGCAGCTGTC ATCCCCCATG CCGAGTGCCA CGCCCAGGGA GOCCGAAAAC 1680
CATTTCGTGG GTGGCGCCGA GGCCAGTGCT CCGGGTGAGG CTGGGAGGCC GCTGAATTCC 1740 
ACGTCCAAAG CCCAGGGGCC CGAGAGCAGC CCAGCAGTCC CCTCCGCGAG CAGCGGCACA 1800 
GCCGGCOCCG GGAATTATGT OCACCCACTC ACAGGGCGGC TGCTTGATCC CAGCTCCCCG 1860 

CTGGCCCTGG CACTCTCCGC AAGGGACCGA GCCATG AAGG AGTCTCAACA GGGAGCCAAA 1920

GGGGAGGCCC GCAAGGCGGA CCTCAACAAA CCTCTTTACA 7TGATACCAA AATGCGGCCC 1980
AGCCTGGATG COGGCTTCCC TACGGTCACC AGGCAGAACA CCCGGGGACC CCTGAGGCGG 2040 
CAGG AG ACGG AGAACAAGTA CQAGACCGAC CTGGGCCG AG ACCGGAAAGG CGATGACAAG 2100 
AAGAACATGC TGATCGACAT CATGGACACG TCCCAGCAGA AGTCGGCTGG CCTGCTGATG 2160 
GTGCACACCG TGGACGCCAC TAAGCTGG AC AACGCCCTGC AGGAAG AGGA CGAGAAGGCA 2220 

GAGGTGGAGA TOAAGCCAGA CAGCTCGCCG TCCGAGGTGC CAGAAGGTGT TTCCGAAACC 2280
GAAGGTGCTT TACAGATCTC CGCTGCCCCC GAGCCCACCA CCGTGCCCGG CAGAACCATC 2340 
GTCGCGGTGG GCTCCATGGA AGAGGCGGTG ATTTTGCCAT TCCGCATOCC TCCTCCCCCT 2400 
CTGGCATCCG TGGACTTGGA TGAGGATTTT A 1 1 ITT ACAG AGCCATTGCC TCCTCCCCTG 2460 
GAATTTGCAA ATA G 1 1 1 1G A TATCCCCG AT GACCGGGCAG CTTCTGTCCC GGCTCTCTCA 2520 

GACTTAGTGA AGCAGAAGAA AAGCGACACC CCTCAGTCCC CTTCGTTGAA CTCCAGCCAA 2580
CCAACCAACT CTGCAGACAG CAAGAAGCCA GCCAGTCTTT CAAACTGTCT GCCTGCCTCA 2640 
TICCTGCCAC CCCCTGAAAG CmGACGCC GTCGCCGACT CTGGGATCGA GGAGGTGGAC 2700 
AGCCGGAGTA GCAGCGACCA CCACCTCGAG ACGACCAGCA CTATCTOCAC CGTOTCTAGC 2760 
ATCTCCACCC TGTCTTC CGA AGGTGGAGAG AATGTGGACA CCTGCACAGT CTATGCAGAT 2820 

GGGCAAGCAT TTATCGTW A CAAACCCCCA GTACCTOCTA AGCCAAAAAT GAAGCCCATC 2880
ATTCACAAAA GCAATGCACT TTATCAAGAC GCGCTCGTGG AAGAAG ATGT AGATAGCTTT 2940 
GTTATCCCCC CGCCCGCTCC COCGCCGCCG CCGGGCAGTG CCCAGCCTGG GAIGGCCAAG 3000 
GTTCTCCAGC CAAGGACCTC CAAGTTGTGG GGCOACGTCA CAGAGATCAA AAGCCCGATT 3060 
CTCTCAGGCC CAAAGGCAAA CGTTATTAGT GAATTGAACT CTATCCTACA GCAAATGAAC 3120 

CGAG AG AAAT TGGCAAAGCC GGGGGAAGG A CTOG ATTCAC CAATGGGAGC CAAGTCCGCC 3180
AGCCTCGCTC CAAGAAGCCC GGAGATCATG AGCACCATCT CAGGTACACG GAGCACGACG 3240 
GTCACCTTCA CTGTTCGCCC CGGCACCTCC CAGCCCATCA CCCTGCAGAG CCGGCCCCCC 3300 
GACTATG AAA GCAGGACCTC AGO AACAAGA CGTGCCCCAA GCCCTGTGGT CTCGCCAACA 3360 
GAGATGAACA AAGAGACCCT GCCCGCCCCC CTOTCTGCTG CCACCGCCTC TCCTTCTCCC 3420 

GCTCTCTCAG ATGTCTTTAG CCTTCCAAGC CAGCCCCCTT CTGGGG ATCT ATTTGGCTTG 3480

AACCCAGCGG GACGCAGTAG GTCGCCATCC CCCTCGATAC TGCAACAGCC AATCTCAAAT 3540 
AAGCCTTTTA CAACTAAACC TGTCCACCTO TGGACTAAAC CAGATGTGGC CGATTGGCTG 3600 
GAAAGTCTAA ACTTGGGTGA ACATAAAGAG GCCTTCATGG ACAATGAGAT CGATGGCAGT 3660 
CACTTACCAA ACCTGCAGAA GGAGGACCTC ATCGATCTTG GGGTAACTCG AGTCGGGCAC 3720 
AGAATGAACA TAGAAAGGGC TTTGAAACAG CTGCTGGACA GATAAGGACG GCTGCTCTCC 3780
ACCTCGCAGA CTGCTCTTGT TATAAGTAGA GATGGGCTGG TGCTGAAACA TCTGAATGCC 3840 
AAGCGAAGTC TGTGAGCATC AACCCCACTC CATGGGTTTG TCTCCTGGTA CCCAAAGAAA 3900 
TACTGAGTTG TGTCCACAAC ATGGCTGGGT CTTCAGACCC CTGGCTCACC ATGTGGGTGT 3960 
CTTGGGCAGT TTCTATCACA CATGGGACAA GGGG AGGGAG TTTTTCTAAC ATGG AAAAAO 4020 
A1TCOCAGCC TGCOGCCCAG CATGCAGGTG GOCTCGCTTT GCCGGGTCCG AGAGGCTCCC 4080 
OGTCAATnT GCACGGG ATC CTAGCICTIG TAGGCAGACA CCAGTGCACT CTAGATACCT 4140 
CCTOAGACCT CCGTCCTCTG CTTTOOGGGC AGCTCTCACC ACCCCAGGCC CCGGCATGAG 4200 
GCCnTCCTC AGTCCTGTGG CCTCTCAGAG GACACCTOAT GCTCACCTGC CCCTCTTTCT 4260 
OCraCACTTO GCTTGCAGTG AGATGCTCCC AG ATGCATTT GTCCAGTGCC CCATCATGGG 4320 
OCIGAAAGGC AGAGAAA C1 1 TTTCCTACAC AGATTCmT CCCCATCTCC TCCTGTGGTT 4380 
TGCATCCATG GCTCTTTGGC CATGAGGTTC CTGGCAGTGC TGGGAGTTTG GATGGGATCG 4440 
TGCCCAGCTT TGCTTAG CTT TCT1 IATTTC TGCAAATCTO TTAGCATAAT TCCAAGGTGG 4500 
CCAAGCAGAT GTCACATGGA GTTAGTCAAA GCACAAAGTC ACOATTCCAC AATGGAGGGG 4560 
AGACCTGGCC AAGGGAGCCA GOCAGCQTGC AACTGCCCAA GCTOCAGGTC TCCAGGACAA 4620 
GAGCAGTTGT CTGCCATG AG CACCCATCCA GGATGGAGAA TAAGGGCTTC TCTGCCTCTC 4680 
AGAATTCTTT TTAATTOAAG ATGTCTTGAG CTCTGCAAAG ATCAGAGCAG GTGAGCATCC 4740 
ACTTTGACAT GAAGGACAAG AAGACGCATG GCTCATGGCG GGCACATGCG GCTGCCAGTG 4800 
AGACAGCGTC TGCTCTGGGA GCTGGGCGGG CACAGCATGC TCAGTTCTGT GCCCAGCCAA 4860 
GGGTGAGCAT CTCTGCTGAG ACAGTOCTTT TGCTCIOGGA GGCCAGGGAA GATGGTACTT 4920 
AG AGGCTTTT CCCCTATCGC TCTGGGTGTC TAGGAATCCC AGCAGCTTGT CTTAACAGTA 4980 
CAACAGCTTC TTTGAGGACC CAGTGGOf AT GGAGTATAGA CAGAACCCAG GGTTGAGAAC 5040 
AGAAGGTGGG CGGCAGGATC AGAGTGAAAG CAGAGGCGTO AGGAGAGGAA AGCAGGGAGG 5100 
TCTCCTGGGC TGCCAGGTCA GCCTCTCTGG CAAGGC 1 1 IC TTGAGCOCCG COCCTTTCTT 5160 
TCO0CGGAGT CCCTCCACCC CATAACAATA CCTCQAATTT CCAAAAGAGG TCACCAGATG 5220 
CACATGGGCC GCAAAACACA CAGTCAGGCT TCCAGCACAT TCTCCCCCAT TTGGAGG ATA 5280 
CTCGAATGTC AGGTTTTTGG TTTTATTATT ATTTCAGA AC TAGCTCAGCC CATCTCTAAT 5340 
TATAAAACAT GGTTTTGTI I It'll 111 1 lCCin'l'll' lUVT GATTAGGTCTG OAAC AOCT 5400 
CTAGAATGAA CACATAAAAT TTAGCAATTT AAAATC 1 TO TI I ACTGCAA GTTTAAATAG 5460 
TTGTACAGAT AGTTTATAAG CACAATATTT TAAGAAAAAA AAGTGGCTGG TCTACTAGGC 5520 
AGCCTTTGTG CCACTTCAGT GCTAGAAAGT TAAAGAAAAA AAAACTTTTG TGATTTAATA 5580 
ATACTATTTC TGTGGAATAA TTATAAAAGT ATG ACCTTTT TAAATCAAGC TTATTTGGAT 5640 
GCATCTGAAC CAGCAGAGCT GTGTTATATT TTCTATCTTT GCTAGAACTT GGTCATTGAA 5700 
GGACAATTTC TTCAAAGTGG TTACAATTCA TAATGCAGCA GTTTCTOCAA AAACAAAAAC 5760 
AAAACACACA GCACACACAC GCGCTTTTOC AGTCACACAC CCCTGATGTT GG AACCAAGT 5820 
TTTTGGACCr TCTGTTCCAA AACCTTTTGC AGGTCAATCT TTGTAnTGA AATG ATCCAA 5880 
TCCAACTTGA AGTCAATTGA ATATTAAGGC GCTTTACTTC CGTGTGCTTT CAGTTTTTCC 5940 
ATCATGAGAT GAATG AGCAT TACTCTAQAT AAATTTCAAG ACAGGATACT ACAGGTGGCC 6000 
TGCTGAGGCT GCCCCATATT TTAGAAAATG TAAAAATGGT GGTTTGGCCA TTAATTTGTC 6060 
TTCCATTTGA TGATACCGCA AAATTCCGTG AGTCCATTCC TTTGGCATGG CACTTTCCCT 6120 
GGGCCTACAG TTGGTATTAC CTCTGTGCTC AGTGGCAGGC AAAACACTAG CTCAAAGGAG 6180 
AGTCAAGGAA ACCGCTGGCA GACGATAACC AGTCGAAACT CGTGACTTCG GTTTGTTGAA 6240 
CTTTGGCAGC CAGTTGGTGA GGGCCAG ATG TTATTCCCTT TCTTAAAGAT ACTCCAAGCC 6300 
ACATGCCACT AACCACAAGC AAGCTGGCTG CAAGACTAAA GAGCTGATAA CATAGTTTAT 6360 
TTTTACACTG TCTTATTATA GAG AAGTAAT AQACCTATCA GAACCTGCAC TGACCAACAA 6420 
ATAAACACAT GTTGCCAAGA TGAATCGGTC TCTATCTCTA TCTGCTTATT TTGGTACTGA 6480 
AAGCAATAGT TCCTCATTCA AATCACCACC CACTG1TCTC CCCCTTTGGG ACATGTTAGG 6540 
ACGAGGCCCT ATTCCATGCC CCTCTTTAATGGTGGAACAA ATGTTAAACT GCTCATCTAA 6600 
AGATCATGTT GATATTATTC CAGGTTTTAA GATCAACTTT TGTTACATAC TGTAATTTAA 6660 
ATAAACTGCA TTTACATGCC TAGTTTCTGT AATATTGTGT ATACAAAACC CAAAT CTCTC 6720 
AAAATOTAAA TTATGTATAC CTGCCAAGAT ACCTnTCCA GGGTGTCTGC GCACATTTTA 6780 
AGTTAATTCA CATAATATAA AAATTACTCA ATGTGACTGT TGATTTGCTG AACTTTACAT 6840 
ATCACAAAGT GAATTATTTG TGATACTTTA GTTAATAAAA TGGTAAATTT TTTTCTCAGT 6900 
TATTGAACAA GCAAGCATTA TCCAGTTGAT CTGGCAATGA CTnTTGTGT GTGGGCCACA 6960 
ATATTG ATTT TCCCATTAAC AA 1TI I ITTT TO rTTTTTAA ATACTAATAT GTTTCACACT 7020 
ATAGTTTGTG TAACAACACG TGTTOGCATT ATCTATGTTG CTOTTACTTT TOTGCTTTTA 7080 
TTCTTTTTAG ACTTTATAAA AAAAAAAAAA AGCTCCTGTA ATTTGCACTT TCTCCCAATC 7140 
CTTAAATCTC TTGTATGGCA ACCAAAATTA CTGTAAAAAA ATAAATATAC TATTGCACTA 7200 
AGGTTGTGGT TCTGATTGCA AACAAACAGT GAACACTGTC TGAATTAAAC AAAAAGCTGC 7260 
CCGACTTGCA ATCTAATGTA GATTATCTCA GGCATTGTGG CCAGCTCTGC CTCTCTAAAA 7320 
CTQACCAGAA AAATCTCTCT CATCGAGTAA ACAGGCTCCT GTCACTGAGC TAATCTGCCT 7380 
TGGTTCCATT TCCTTATTCT CAATTTATCA ATGGATACGT GCATGTTATT TCAG AATTAT 7440 
GCAAAACGTC AAAATCTGCT TCTGTGACGG CTGCTATAGG CGTGGAGCTG AGGCTCGGCT 7500 
TTTCCTTTTG TTCTGGGTGG AAGCAGCGGT GCCGCGGAGG GCCAGCCAGA TCCGGACCCT 7560 
TCCCTTAGGG TCCAGTCTCC CCACACCXXA GCAGGGTGTC TTCTAGCCAT AAGGCCAAGG 7620 
GAGTGGCAGA ACTGGGCCGC CTCTCTGGTT GACAAGCAAA CCACATGCTA AGGCTTGGAG 7680 
CAAG AGAOAA TTTGTGTCTA TTGGCAAAGA ACTAAGOCAG GAAGACATGG GCCATCOCTC 7740 
CGCTTTAGGG AAGCATATTT TAAACCTAAA CGTTGAACTT CTTCTTTGGC CTCACCAGTG 7800 
AAAAC1 1GTT GTCTTTAGTT CCTAAAGTTT CTTCTACTTT GGCACATTOC CCAGTTGAGC 7860 
AGCAGCCTCT ATGCTTCCAC GTTCAGGAAA AATTCCAGTC CTCATATCTT TTGTAGTTCA 7920 
CCCTCAAGCT CTCCCGCTTC ACCATOCAAT AGTTTCTCCC AAACCTTGGC ACCCCCCTAG 7980 
ACTTTGCTTC CAATGGTTTC TTOCAGACCA LI 1 1'ICCTAG ATGAATATAT TCGTT TACCT 8040 
TACTAGGAAA ATTATTGGAA GATTTTTTCT TTTACTTGAA ATTGOAGGCA TTTTAATAAC 8 100 
TGGOG AACTG G AATGTOTTT CTGTATTTGT AG ACAACCAT GTACCCATGC AAGTAGGTGA 8160 
ACATTCCACA GTGGCTGGGT GACCACAGCA GCTGCATGCA GACAGGACTG CCCGTGCTTT 8220 
GTGGGGAATC AG AG AATTTC CAAACTTGTT TCTCAGACTT CCGCAGATCT CATCACTTTG 8280 
ATTTCTAATC CATGCTGTAT TGGTGATTTT GTTTATCGTT CCTGTAACTT GTTCTACATT 8340 
CCACAGTCTT TACCGTTTTA TGTTCAAAAT TACAACAATC OCTGTCCATT G ATTCCACTC 8400 
TGGAACTCTT TGTTCATGCC AATTTTGAAA TTTTAATACG AGCCTTCAAA TAAACACAGA 8460 
AAAG AAAAAA AAAAAAAAAA AAAAAAAA



SEQ ID NO:165 PEZ6 Protein sequence
Protein Accession #: BAAB2974.1



1 11 21 31 41 51 
I I I I I I 

MMMNVPGGGA AAVMMTOYNN GRCPRNSLYS DCHEEKTW LQKKDNEGPG FVLROAKADT 60 
PlfchhlVlP A FPALQYLESV DEGGVAWQAG LRTGDFUEV NNENWKVGH RQWNMRQO 120 
CNHLVLKWT VTRNLOPDDT ARKKAPPPPK RAPTTALTLR SKSMTSELEB LVDKDKPEH 180 
VPASKPSRAA ENMAVEPRVA TIKQRPSSRC FPAGSDMNS V YERQGIAVMT PTVPGSPKAP 240 
FLGD7RGTMR RQKSIDSR1F LSGtlEEERQ FLAPPMLKFT RSLSMPDTSE DIPPPPQS VP 300 
PSPPPPSPTT YNCPKSPTPR VYGTIKPAFN QNSAAKVSPA TRSDTVATMM REKGMYFRRB 360 
LDRYSLDSED LYSRNAGPQA NFRNKRGQMP ENPYSEVGK1 ASKAVYVPAK PARRKGMLVK 420 
OSNVEDSPEK TCSfflPTIl VKEPSTSSSG KSSQGSSMH DPQAPEPPSQ LRPDESLTVS 480 
SPFAAAIAGA VROKEKRLEA RRNSPAFLST DLGDEDVGLG PPAPRTRPSM FPEEGDFADE 540 
DS AEQLSSPM PS ATPREPEN HFVGGAEAS A PGEAGRPLNS TSKAQGPESS PAVPSASSGT 600 
AGPGNYVHPL TGRIXDPSSP LALALSARDR AMKESQQGPK GEAPKADLNK PLYIDTKMRP 660 
SLDAGFPTVT RQNTRGPLRR QETENKYETD LGRDRKGDDK KNMUDMDT SQQKSAGLLM 720 
VHTVDATKLD NALQEEDEKA EVEMKPDSSP SEVPEGVSET EGALQISAAP EPTTVPGRH 780 
VAVGSMEEAV ILPFRIPPPP LASVDLDEDF IFTEPLPPPL EFANSFDIPD DRAASVPALS 840 
DLVKQKKSDT PQSPSLNSSQ PTNSADSKKP ASLSNCLPAS FLPPPESFDA VADSGEEVD 900 
SRSSSDHHLB TTSTISTVSS BTLSSEGGE NVDTCTVYAD GQAFMVDKPP VPPKPKMKPI 960 
IHKSNALYQD ALVEEDVDSF VIPPPAPPPP PGSAQPGMAK VLQPRTSKLW GDVTEKSPI 1020 
LSGPKANV1S ELNSDJQQMN REKLAKPGEG UJSPMGAKSA SLAPRSPHM STISGTRSTT 1080 
VTFTVRPGTS QPITLQSRPP DYESRTSGTR RAPSPWSPT EMNKETLPAP LS AATASPSP 1140 
ALSDVFSLPS QPPSGDLFGL NPAGRSRSPS PSILQQPISN KPFTTKPVHL WTKPDVADWL 1200 
ESLNLGEHKE AFMDNEJDGS HLPNLQKEDL IDLGVTRVGH RMNERALKQ LLDR 



SEQ ID NO:166 PEZ4 DNA SEQUENCE

Nucleic Acid Accession #: NM_000024

Coding sequence: 220-1461 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

ACTGCGAAGC GGCTTCTICA GAGCACGGGC TGGAACTGGC AGGCAOOGCG AGGCCCTAGC 60 
ACCCGACAAG CTGAGTCTGC AGGACGAGTC CCCACCACAC CCACACCACA GCCGCTGAAT 120 
GAGGCTTCCA GGOGTCCGCT CGCGGCCCGC AGAGCCCCGC CGTGGGTCCG CCCGCTG AGG 180 
CGCCCCCAGC CAGTGCGCTT ACCTGCCAGA CTGCGCGC CA TGG GGCAACC CGGGAACGGC 240 
AGCGCCTTCT TGCTGGCACC CAATAGAAGC CATGCGCCGG ACCACGACGT CACGCAGCAA 300 
AGGGACGAGG TGTGGGTGGT GGGCATGGGC ATCGTCATGT CTCTCATCGT CCTGGCCATC 360 
GTGTTTGGCA ATCTGCTGGT CATCACAGCC ATTGCCAAGT TCGAGCGTCT GCAGACGGTC 420 
ACCAACTACT TCATCACTTC ACTGGCCTGT GCTGATCTGG TCATGGGCCT GGCAGIGGTG 480 
CCCTTTGGGG CCGOOCATAT TCTTATGAAA ATGTGOACTT TTGGCAACTT CTGOTGOGAG 540 
TnTGGACTT CCATTGATCT GCTGTGCGTC ACGGCCAGCA TTGAGACCCT GTGCGTGATC 600 
GCAGTGGATC GCTACTTTGC CATTACTTCA CCTTTCAAGT ACCAGAGCCT GCTGACCAAG 660 
AATAAGGCCC GGGTG ATCAT TCTGATGGTG TGGATTGTGT CAGGCCTTAC CTCCnCITG 720 
CCCATTCAGA TGCACTGGTA CCGGGCCACC CACCAGGAAG CCATCAACTO CTATGCCAAT 780 
GAGACCTGCT GTGAC TTCTT CACGAACCAA GCCTATXKXA TTGCCTCTTC CATCGTOTCC 840 
TTCTACGTrC CCCTGGTQAT CATGGTCTTC GTCTACTCCA GG U1C11 1 CA GGAGGCCAAA 900 
AGGCAGCTCC AGAAGATTGA CAAATCTGAG GGCOGCTTCC ATGTCCAGAA CCTTAGCCAG 960 
GTGGAGCAGG ATGGGCGGAC GCGGCATGG A CTCCGCAG AT CTTCCAAGTT CTGCTTGAAG 1020 
GAGCACAAAG CCCTCAAGAC GTTAGGCATC ATCATGGGCA CTTTCAGCCT CTGCTGGCTO 1080 
CC LI1C11C A TCGTTAACAT TGTGCATGTG ATCCAGGATA ACCTCATCCG TAAGGAAGTT 1140 
TACATCCICC TAAATTGGAT AGGCTATGTC AATTCTGGTT TCAATOCOCT TATXTACTGC 1200 
CGGAGCCCAG ATTTCAGGAT TGCCTTCCAG GAGCTTCTGT GCCTGCGCAG GTCTIClTTG 1260 
AAGGCCTATG GGAATGGCTA CTOCAGCAAC GGCAACACAG GGQAGCAOAG TGGATATCAC 1320 
GTGGAACAGG AGAAAGAAAA TAAACTGCTG TGTGAAGACC TCCCAGGCAC GGAAGACTTT 1380 
GTGGGCCATC AAGOTACTGT GCCTAG OGAT AACATTGATT CACAAGGOAO GAATTGTAGT 1440 
ACAAATGACT CACTCCTGIA^AGCAGTTTT TCTACnTTA AAGACCCCGC CCCCCCCAAC 1500 
AGAACACTAA ACAGACTATT TAACTTGAGG GTAATAAACT TAGAATAAAA TTGTAAAAAT 1560 
TGTATAGAGA TATGCAGAAG GAAGGGCATC CTTCTGCCI 1 1111 ATTTTT T7A AGCTGTA 1620 
AAAAGAGAGA AAACTTATTT GAGTGATTAT TTGTTATTTG TACAGTTCAG TTCCTCTTTG 1680 
CATGGAATTT GTAAGTTTAT GTCTAAAGAG CTTTAGTCCT AGAGGACCTG AGTCTGCTAT 1740 
ATTTTCATGA CTTTTCCATG TATCTA CCTC A CTATTCAAG TATTAGGGGT AATATATTGC 1800 
TGCTGGTAAT TTGTATCTGA AGGAGATTTT CCTTOCTACA CCCTTGGACT TGAGGATTTT 1860 
GAGTATCTCG GACCTTTCAG CTGTGAACAT GGACTCTTCC CCCACTCCTC TTATTTGCTC 1920 
ACACGGGGTA TTTTAGGCAG GGATTTGAOO AGCAGCTTCA GTT GTTTTCC CGAGCAAAGG 1980 
TCTAAAGTTT ACAGTAAATA AAATGTTTGA CCATG 



SEQ ID NO:167 PEZ4 Protein sequence
Protein Accession #: NP_000015.1



1 11 21 31 41 51
MGQPGNGSAF LLAPNRSHAP DHDVTQQRDE VWWGMGIVM SLIVLAIVFG NVLVITAIAK 60
FERLOTVTNY FITSLACADL VMGLA WPPG AAHILMKMWT FGNFWCEFWT SIDVLCVTAS 120 
IETLCVIAVD RYFAITSPFK YQSLLTKNKA RVHLMVWIV SGLTSFLPKJ MHWYRATHQE 180 
AINCYANETC CDFFINQAYA IASSIVSFW PLVIMVFVYS RVFQBAKRQL QKIDKSEGRF 240 
HVQNLSQVEQ DQRTGHGLRR SSKPCLKEHK AIJCTLGIIMG TFTLCWLPFF IVNIVHVIQD 300 
NURKEVYIL LNWIG YVNSG FNPUYCRSP DFRIAFQELL CLRRSSLKA Y GNGYSSNONT 360 
GEQSGYHVEQ EKENKLLCED LPGTEDFVGH QGTVPSDNID SQGRNCSTND SLL 



SEQ ID NO:168 PEZ1 DNA SEQUENCE

Nucleic Acid Accession #: NM_004457

Coding sequence: 143-2303 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

GAATTCGTTG TTGGQAAGGA CTGGGGAAAC AGCTOTAACA TTTOCCACCC TCAGAAGCTO 60 
CTGGTCCTGT GTCACACCAC CTTAGCCTCT TGATCGAGGA AGATTCTCGC TG AAGTCTGT 120 
TA AntTTACT TTTTGAGTAC TTATOA ATAA CCA CQTG TCT TC AAAA CCAT CTAC CATQA A 180 
GCTAAAACAT ACCATCAACC CTATTCTTTT ATATTTTATA CATTTTCTAA TATCACTTTA 240 
TACTAnTTA ACATACATTC CGTTTTATTT TTTCTCCGAG TCAAGACAAG AAAAATCAAA 300 
CCGAATTAAA GCAAAGCCTO TAAATTCAAA ACCTQATTCT GCATACAGAT CTGTTAATAG 360 
nTGGATGGT TTGGCTTCAO TATTATACCC TGGATGTGAT ACTTTAGATA AAGTTTTTAC 420 
ATATGCAAAA AACAAATTTA AGAACAAAAG ACTCTTGGGA ACACGTGAAG TTTTAAATGA 480 
GGAAGATGAA GTACAACCAA ATGGAAAAAT TTTTAAAAAG GTTATICTTG GACAGTATAA 540 
TTGGCTTTCC TATGAAOATG T CI I 'l Gl ICG AGGCTTTAAT TTTGGAAATG GATTACAGAT 600 
GTTGGGTCAG AAAOCAAAGA GCAACATCGC CATCTTCTGT GAGACCAGGG CCGAGTGGAT 660 
GATAGCTGCA CAGGCGTGTT TTATGTATAA TTTTCAGCTT GTTACATTAT ATGCCACTCT 720 
AGGAGGTCCA GCCATTGTTC ATGCATTAAA TGAAACAGAG GTGACCAACA TCATTACTAG 780 
TAAAGAACTC TTACAAACAA AGTTGAAGGA TATAGTTTCT TTGGTCCCAC GCCTGCGGCA 840 
CATCATCACT GTTGATGGAA AGCCACCGAC CTGGTCCGAC TTCCCCAAGG GCATCATTGT 900 
GCATACCATG GCTGCAGTGG AGGCCCTGGG AGCCAAGGCC AGCATGGAAA ACCAACCTCA 960 
TAGCAAACCA TTGCGCTCAG ATATTGCAGT AATCATGTAC ACAAGTGGAT CCACAGGACT 1020 
TCCAAAGGGA GTCATGATCT CACATAGTAA CATTATTGCT GGTATAACTG GGATGGCAGA 1080 
AAGGATTCCA GAACTAGGAG AGGAAGATGT CTACATTGGA TATTTGCCTC TGGCCCATGT 1140 
TCTAGAATTA AGTGCTG AGC TTGTCTGTCT TTCTCACGGA TGCCGCATTG GTTACTCTTC 1200 
ACCACAGACT TTAGCAGATC AGTCTTCAAA AATTAAAAAA GGAAGCAAAG GGG ATACATC 1260 
CATGTTGAAA CCAACACTGA TGGCAGCAGT TCCGGAAATC ATGGATCGGA TCTACAAAAA 1320 
TGTCATGAAT AAAGTCAGTG AAATCAGTAG TTTTCAACGT AATCIGTTTA TTCTGGCCTA 1380 
TAATTACAAA ATGG AACAGA TTTCAAAAGG ACGTAATACT CCACTGTGCG ACAGCTTTGT 1440 
TTTCCGG AAA GTTCGAAGCT TGCTAGGGGG AAATATTCGT CTCCTGTTGT GTGGTGGCGC 1500 
TCCACTTICT GCAAOCACGC AGCGATtCAT GAACATCTGT TTCTGCTGTC CTGTTGGTCA 1560 
GGGATACGGG CTCACTG AAT CTGCTOGOGC TGGAACAATT TCCGAAGTGT GGG ACTACAA 1620 
TACTGGCAGA GTGGGAGCAC CATTAGTTTG CTGTGAAATC AAATTAAAAA ACTGGGAGGA 1680 
AGGTGG ATAC TTTAATACTG ATAAGCCACA CCCCAGGGGT GAAATTCTTA TTGGGGGCCA 1740 
AAGTGTGACA ATGGGGTACT ACAAAAATGA AGCAAAAACA AAAGCTGATT TCTCTGAAGA 1800 
TGAAAATGGA CAAAGGTGGC TCTGTACTGG GGATATTGGA GAGTTTG AAC GCGATGGATG 1860 
CTTAAAGATT ATTGATCGTA AAAAGGACCT TGTAAAACTA CAGGCAGGGG AATATGTTTC 1920 
TCTTGGGAAA GTAGAGGCAG CTTTGAAGAA TCTTCCACTA GTAG ATAACA TTTGTGCATA 1980 
TGCAAACAGT TATCATTCTT ATGTCATTGG ATTTGTTGTG CCAAATCAAA AGGAACTAAC 2040 
TGAACTAGCT CGAAAGAAAG GACTTAAAGG GACTTGGGAG GAGCTGTGTA ACAGTTGTGA 2100 
AATGGAAAAT GAGGTACTTA AAG r G CTTTC CGAAGCTGCT ATTTCAGCAA GTCTGGAAAA 2160 
GTITGAAATT CCAGTAAAAA TTCGTTTGAG TOCTG AACCG TGGACCCCTG AAACTGGTCT 2220 
GGTGACAGAT GCCTTCAAGC TGAAACGCAA AGAGCTTAAA ACACATTACC AGGCGGACAT 2280 
TGAGCGAATG TATGGAAGAA AAT A ATT ATT CTCTTCTGGC ATCAGTTTGC TACAGTGAGC 2340 
TCACATCAAA TAGGAAAATA CTTGAAATGC ATGTCTCAAG CTGCAAGGCA AACTCCATTC 2400 
CTCATATTAA ACTATTACTT CTCATGACGTCACCA 11111 AACTGACAGG ATTAGTAAAA 2460 
CATTAAGACA GCAAACTTGT GTCTOTCTCT TCTTTCATTT TCCCCGCCAC CAACTTACTT 2530 
TACCACCTAT GACTGTACTT GTCAGTATGA GAATTTTTCT GAATCATATT GGGGAAGCAG 2580 
TGATTTTAAA ACCTCAAGTT TTTAAACATG ATTTATATGT TCTGTATAAT GTTCAOTTTG 2640 
TAACTTTTTA AAAGTTTGGA TGTATAGAGG GATAAATAGG AAATATAAGA ATTGGTTATT 2700 
TGGGGG C 1 TT 1 TT ACTTACT GTATTTAAAA ATACAAGGGT ATTGATATGA AATTATGTAA 2760 
ATTTCAAATG CTTATGAATC AAATCATTGT TG AAC AAAAG ATTTGTTGCT GTGTAATTAT 2820 
TGTCTTGTAT GCATTTGAGA GAAATAAATA TACCCATACT T ATGT TTTAA GAAGTTGAGA 2880 
TCTTGTGAAT ATATGCCTGT CAGTGTCTTC TTTATATATT TATTTTTTAT TAG AAAAAAT 2940 
GAAGTTTGGT TGGTG ATGCA TGAAACAAAA TAGCAAG AGA GGGTTATAGT TTAATAGTAA 3000 
GGGAG ATAAC AjCAGCATGTG TAGCACCAGT TGATAATTGG TCTCTAGTAG CTTACTGTCA 3060 
AAATGTTCAA TGAAGTCTTC TGTTCATCTG TTGAAACTAG GAAAATACCC AAACTTAAAT 3120 
GGAAGAATTC TGAAAGAGAG GATAGAATTT AAAGAACAAG AGTATATAAA GTTATTCTTT 3180 
GAATATTTCG TTGACTATAT GTACATTGAG TTATCTATAT TTGTAAACAA ATTAGTCATG 3240 
GAAAATTATT CTATTCCAAA GTCTCCTTTT AGTCTAGATA ATCATTATTT CATTTTAAAA 3300 
TTAGTGTTTT TCATAGTTTG CACTGATGCG TGTATGG ATG TGTGTGAGTC AGTGGTAGCT 3360 
TATTTAAAAA GCACCTTATC CTTTCTCCCA TAAOCTTTGT ACACTAAAAA ATGAAAGAAT 3420 
TTAGAATGTA TTTG ATG ATA GCATTCTCAC TAAGACACAT GAGAATTTAA CTTTATAACC 3480 
GCGTGAGTTA AGATTTAATT CATAGGTTTT GATGTCATTG TTGAAGTTAT TTGTAATTCA 3540 
GAAACCTTGC TTGTGTGATA CATAGTAAGT CTCTTCATTT ATTACTGCTT GCCTGTTGTT 3600 
ATATCTCGAT TATCAAAAGC AATAGTGCAC CAATTAAGAT GTGCTCAAAT CAGOACTTAA 3660
ATCATAGGCA CCACATTTTT CATGTCAGAC TAGTTACTTT GTTGATTCTC AGTTACTGTA 3720 
GGCATCAAAA GGCAAAAATC A 



SEQ ID NO:169 PEZ1 Protein sequence
Protein Accession #: NP_004443.1



1 11 21 31 41 51
I I I I I I

MNNHVSSKFS TMKLKHTINP UXYFIHFU SLYTILTYIP FYFFSESRQE KSNRKAKPV 60 
NSKPDSAYRS VNSLDGLASV LYPGCDTLDK VFTYAKNKFK NKRLLGTREV LNEEDEVQPN 120 
QKIFKKVILG QYNWLSYED V FVRAFNFGNG LQMLGQKPKT NlAffCETRA EWMIAAQACF 180 
MYNFQLVTLY ATLGGPATVH ALNETEVTNI ITSKELLQTK LKDIVSLVPR LRHnTVDGK 240 
PPTWSDFPKG IIVHTMAAVE ALQAKASMEN QPHSKPLPSD IAVIMYTSGS TGLPKG VMIS 300 
HSNHAGITC MAERIPELGE EDVYIGYLPL AHVLELSAEL VCXSHGCJUG YSSPQTLADQ 360 
SSKKKGSKG DTSMLKPTLM AAVPEMDRI YKNVMNKVSE MSSFQRNLFI LAYNYKMEQI 420 
SKGRNTPIXD SFVFRKVRSL LGGNKUJC GGAPLSATTQ RFMNKfCCP VGQGYGLTES 480 
AGAGT1SEVW DYNTGRVGAP LVCCEKLKN WEEGGYFNTD KPHPRGEHJ GGQSVTMOYY 540 
KNEAKTKADF SEDENGQRWL CIGDIGEFEP DGCLKHDRK KDLVKLQAGE YVSLGKVEAA 600 
LKNLPLVDKI CAYANSYHSY V1GFWPNQK ELTELARKKQ LKGTWEELCN SCEMENEVLK 660 
VLSEAABAS LEKFEIPVKI RLSPEPWTPE TGLVTDAFKL KRKELKTHYQ ADEERMYGRK 



SEQ ID NO:170 PCQ7 DNA SEQUENCE

Nucleic Acid Accession #: none found

Coding sequence: 33-1075 (underlined sequences correspond to start and stop codons)






AGCAACGACG 
CCTGCTGCTG 
OTGCAACATA 
GTGTGAOGGS 
GTCGAAATGT 
CTTCCGGTGC 
AAACCCTCTG 
GAGCTTCATC 
AAGTTCTCAA 
TTACCCCAGC 
CCTGCTGGCA 
GCACCGGCTS 
CTGCAACGTC 
GAATGCGTCG 
TGCGTGGTAT 
06A0CTGCCC 
CAGCAGCCTC 
GGGCACIGCT 
AGTTATTCCA 
TGCTCATGGG 
AACTATCTCT 
TGACATGATC 
CACCCPCATT 
AAAIAGGCTG 
CGCTGGACCC 
ATGATCTAAC 
ATCAAAACCT 
AAGAAAACTT 
AAGGACTCTG 
CPCATTCTGA 
GAGCCCCTCC 
TACACCTGCC 
AOCTGGCCGT 
GTATGTCCCT 
CTCCAAAGTT 
ACTGGTTTCT 
CTGCACTGTG 
GGTCAGGGTC 
AQACAATTTQ 
TGAAACAGTG 
AGCTGTCTCT 
ACACCCTTGC 
ACATTTGTGC 
AGAGGGACTC 
TTCTCTGTGT 
AGGTGTTGTT 
GCACTCOGGG 
AACCTGTTTG 
TGATCCTGTT 



11 
I 

CCGGGCAGCG 
AGCAGCGCCG 
CCAGGCAACT 
CTGCCTGACT 
GGCCCAACCT 
AATGGGTTTG 
CTTTGCTCCA 
TGCGATCGAC 
GAACCCGGCA 
ATCACCTATG 
CTGGTCTTGC 
CAGCACCCTG 
ACCTACAACC 
GAAGTAGGCT 
GAOCTTCCTC 
CCCTACCGCT 
CTGAGCGTGG 
GAGCCCAGGG 
AAGTCCATAT 
AAGCTCTTTA 
GCATTCCCCT 

TTTCACATTA 
GGAGAGAGCA 
AATTCTCTCT 
CAGGAGGCCA 
GCTTTGCACA 
TGGACGTGAG 
AAACCATCTA 
GAGCTTTCCT 
CATGAGTTTA 
CTGGCTCTAC 
AGCCAAGGAA 
GTGGCCCACA 
OCCTTAACAC 
ATCACAGGTO 
CACGCTCCTC 
AGGCCTCTCC 
GAGTCAAGAT 
TO l T TC m 'f 
WT TTTC m ' 
CCCGCTOAGC 
ATTGTTGCAC 
CTCTCTCCCT 
CCAGTCAGCC 
TGGCAAGAAA 
CAGCTGTCAC 
ACGCTAATTA 
CTGTAGACTT 



21 
I 

GGAGCGGCGG 
OGGAGAGCCA 
TCATGTGCAG 
GCTTCGACAA 
TCTTCCCCTG 
AGGACTGTCC 
CCGCCCGCTA 
AGAATAACTG 
GTGGGCAGGT 
CCATCATCGG 
ACCACCAGCG 
TGC7GCTGTC 
TCAATAATGG 
CCCCACCCTC 
CACCGCCCTA 
CCCGGTCCGG 
AAGACACCAG 
ACTCTGAGCC 
GGGTTAATCT 
AGCACCTGTA 
CCTCCCCCAG 
CTTTTCTGTC 
TTCTQTTTCT 
ATGTTTCTGT 
GCTGGGTAGT 
TCACTGGATG 
ATCCTATTTG 
TAACACCCTT 
CCCTGTATAA 
CAGCAGCATA 
TCCAAGTTCT 
AGCCACTTAC 
TGAGGACCTA 
CCCAGCCTGT 
TTGCAAAGTC 
AQAGCCATGT 
TTCCCAAGGT 
CAACATCCCA 
TTTCCATTTG 
TTCCCTTCTA 
TTCCTTTAAC 
CCCGTGATAA 
TTTGAGGTTA 
CCGTGTATAG 
ACAGGGCCCG 
CCACACTGAC 
CCATTCAGAA 
AAACAGAGCC 
T W-T H CTT T 




CATCCAGTAT 
CTACTCOGAG 
CTCTTCTGAC 
GAGTGCCAAC 
CCACAGCOCG 
CAGCCAGGGC 
GCTCTGACTT 
AGGATGTCTC 
ACTTCAQAGA 
AGGTCACTCT 
GTTGGAGAGA 
GCTATATTGG 
TACCTTATAG 
GTCACOCCCC 
ATGCCCCCAG 
CAGCAGTOGC 
ATTCTGGCTT 
TATCATCAGC 
CAGCTCCTAA 

ACTTGAGTTG 
CTTGCTCATT 
CTTTTTACCT 
TCAATACCTC 
CCCAATACCA 
GTAGTTTCTC 
GATCTATTTT 
GTTAAGGGAC 
AAGGTCCAAA 
CAAGTCACTC 
TTATTTATCA 
TCTCTATGTT 
CCTCCCTGCA 
TGATGAGGGG 
CTTCTTTCCG 
TGCAGGAAGT 
TTTTAACCAA 



41 
I 

GGGAACAACT 
7CCATCCCGG 
AAGGAGW3CC 
ATOCATTGCA 
GATGAAGAGA 
AAOGGCCTCT 
AGTGATGAGG 
TCAGAGAACC 
ATTTTTGTGC 
AACCTCATGA 
GTCCTGGACC 
GTGGCCAGCC 
GGCTTGCTGG 
ACGGAATCTC 
AGTGCCAGCT 
GGGCAGCCTG 
ACTGAAGAAQ 
GTTGCCATTC 
AAGTTACAGT 
TGTTTTTCTG 
TCCCTTGGGA 
CAGCAXATAA 
ATGCTCAGAA 
CATTTGGGGA 
CAAAAAAATT 
TTCAGCAGAG 
AACGTTATTT 
TAGAAATTTG 
CTCATOCTAA 
AATGCAGGCT 
GACTGTCACC 
GCCCAAAGTC 
CATGCACCCT 
GTGCATTTGG 
CAGCAAGCTC 
GCAOCTCTAG 
CTCTGAGACA 
AAATCTTTTA 
TA1TTATATG 
GAAAGATGCA 
CAGACTAACC 
AGTTCTTGAA 
TGTGCTAGTT 
GGAATAAGGG 
TAAAATCGAA 
CAGCTGAAGA 
GGGGCTAAAG 
AICCAAAGGA 



51 
I 

GGCCGCTGTG 
TCACCAATGA 
GCGCCTGGCA 
CCAAGGCZAA 
TCATTGGTCG 
ACTGCACAGC 
GTATTGACAA 
AAAGCTGTGA 
AACTTGTGTA 



ACCCCCACCA 
AGGCGGAGCA 
AOCAGAGGCC 
TGAACCAAGC 
CCCAGGCAGC 
GCCCCCAGGA 
T ATAAG TCCC 
TAACAATTTG 
TTGGGATATT 
GCGTCTCAGT 
OCOGAGATCA 
AACAGTATTG 
GTGCAGGAGA 
TTTGGGTTAG 
CCATTTGAGC 
TCAGTGGCCA 
TGGTTTTGTG 
CCCAAGAATG 
AATAGGCAGG 
GCCAAOACCC 
CTCCCAGCTG 
TGACCTGGCT 
CAACACTGGC 
ACTTGAGGAC 
TCCTGGCTCC 
TTAGAGTTAG 
CATGGGCAAG 
GAAATGCATT 
TGTATAGGAA 
AAAGGAGATC 
TGTGTGCCAG 
GGAAGCAGAA 
TTTCTTTTTT 
GTAAAACGTT 
CCAGGTAGAG 
AATGTTCAGT 
TGGCATICAG 
TGTTACAGAA 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
AAGCTAGCCA CTGGTATTTT MHHUHR AAAAAAAAAA GAAAGAAAGA AAGAAAGAAA 3000

AACGGAAAGO AACCTAGCTG CCTGTATCTT TCATTTTTAA AATAGCACTT GAQTTATTTT 3060 

CTGAGTAATC CAATAAAGAA CTTTTGATGA CAGCCAGAAT GTGTTAGAAC TCTGGCT6AA 3120 

CATTTCATCT CCTGTGAGTC AGAAGGGCTT TATTTCTCCC TTTGATGGGG CCCCTTCTTC 3180 

TTTCTUUTm.' TCTGGAASTT GTTTAOAGGA AAGAATTCTA ATTTTAATTA ATTGCGCAGT 3210 

GAGTTAATCT CACTCGCTTT TCTGCTTCCA GGCATCTTAG GAAAAACAAA TGGTTTTAGT 3300 

ACATAAGGGA TGCCTACTAA TGCTTTTTtA AAACAAACAG GGACATTTTT ATTATAGATT 3360 

TOATTTTTTT AATGAAIGTT TTTAAAAATA TATAAATAGG ACACCAAAGC GGCAGGGTTT 3420 

TTTTTGGGGG GAOGGGGTTT GTTTTCCAAC TCAAGATGGC ACATTAGTGG CCAGCAATAT 3480 

TTTTTAACTC ATTCCAACCA GGAAGCTTTT TTATACATTG CCTAAATCTA 06CCAACCA6 3540 

AAAATAGTCT CATCTC TTT T TTTCTCAAAT GAGATCC6TG TTTTATTTTA GCATTAAATT 3600 

AGTTACACTG TGATGACTGG CCTATTACCT GACTCAGCTC CCTCTACCTT GAAATTGACA 3660 

TTTTTAAAAA ATGCAACTAA GTGGTTAATA GTGTGTGACG CTCAAAGTTA ATGTAAACTC 3720 

GAAAGGTTGT GTGTCGTTGC TTTT T gTCTT TTGGTTAGGC TTGGTTTTGT TTTTTAATTT 3780 

TTAXACTTTC TAATAAATTT GCAGTTTCAT TCTTTCTGTT TGTOCAAAWG GWMCTAMARM 3840 

AAMHAAAAAC AWYWTTGGGG GG G CTTGGGC CTCGGAAAAA GTTTTTAACA CCACTTCGGG 3900 

T66GGCGGC0 GOGCCCACGT AGGTACGGCG ACCACGCGGG CCCAAACGGG AOCCCAGAAQ 3960 

GAAACCCTGG CCAAGAAAAA GGTGG06AGA ATTCTCCACA CCAGAAAAAA ACGCGCCGGG 4020 

GGAAACCGCA GAGTGTTGOQ TAAACCACAC CCGAAGAGAO AACTCAGAAQ CACACAAGCG 4080 
GQACTCAACC AGGAGGACCC AAGOOAAOCC GATAGAGTAC G 



SEQ ID NO:171 PCQ7 Protein sequence

Protein Accession #: none found

1 11 21 31 41 51

I I I I I I 

MWLLGPLCLL LSSAABSQU, PGHHFTNEOJ IPGNFKCSNG RCIPGAWQCD GLPDCFDKSD 60 

EKECPKAKSK CGPTFFPCAS GIHC1IGRFR CNGFEDCPDG SDEENCTANP LLCSTARVEC 120 

KNGLCIDKSF ICDGQtJNCQD NSDBBSCBSS QEPGSGQVFV TSENQLVYYP SITYAIIGSS 180 

VXPVLWALL ALVLHBQRSR HNUfTLPVHR LQHPVLLSRL WLDHPEBCN VTYHVNMGIQ 240 

YVASQAEQNA SEVGSPPSYS EALLDQRPAW YDLPPPPYSS DTESLNQADL PPYRSRSGSA 300 
NSASSQAASS LLSVEDTSKS PGQPGPQEGT AEPHDSEPSQ GTEEV 

SEQ ID NO:172 PELS DNA SEQUENCE
Nucleic Acid Accession #: KMJ0O5658.1

Coding sequence: 57-1535 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I

GTCATATTGA ACATTCCAGA TACCTATCAT TACTCGATGC TGTTGATAAC AGCAAGATGG 60 

CTTTGAACTC AGGGTCACCA CCAGCTATTG GACCTTACTA TGAAAACCAT GGATACCAAC 120 

CGGAAAACCX CTATCCCGCA CAGCCCACTO TGGTCCCCAC TGTCTACGAG GTGCATCCGO 180 

CTCAGTACTA CCCGTCCCCC GT GCCCCAGT ACGCCCCGAG GGTCCTGACG CAGGCTTCCA 240 

ACCCCGTCGT CTGCACGCAG CCCAAATCCC CAICCGGGAC AGTGTGCACC TCAAAGACTA 300 

AGAAAGCACT GTGCATCACC TTGACCCTGG GGACCTTCCT CGTGGGAGCT GCGCTGGCCG 360 

CTGGCCTACT CTGGAAGTTC ATGGGCAGCA AGTGCTCCAA CTCTGGGATA GAGTGCGACT 420 

CCTCAGGTAC CTGCATCAAC CCCICTAACT GGTGTGATGG CGTGTCACAC TGCCCCGGCG 480 

GGGAGGACGA GAATCGGTGT GTTCGCCTCT AOGGACCAAA CyTCAICCTT CAGATGTACT 540 

CATCTCAGAG GAAGTCCTGS CACCCTGTGT GCCAAGACGA CTGGAACGAG AACTACGGGC 600 

GGGCGGCCTG CAGGGACATG GGCTATAAGA ATAATTTTTA CTCTAGCCAA GGAATAGTGG 660 

AIGACAGCGG ATCCACCAGC TTTATGAAAC TGAACACAAG TGCCGGCAAT GTCGATATCT 720 

ATAAAAAACT GTACCACAGT GATGCCTGTT CTTCAAAAGC AGTGGTTTCT TTACGCTGTT 780 

TAGCCTGCGG GGTCAACTTG AACTCAAGCC GCCAGAGCAG GATCGTGGGC GGTGAGAGCG 840 

CGCTCCCGGG GGCCTGGCCC TGGCAGQTCA GCCTGCACGT CCAGAACGTC CACGTGTGCG 900 

GAGGCTCCAT CAICACCCCC GAGTGGATCO TGACAGCCGC CCACTGCGTG GAAAAACCTC 960 

TTAACAATCC ATGGCATTGG ACGGCATTTQ CGGGGATTTT GAGACAATCT TTCATGTTCT 1020 

ATGGAGCCGG ATACCAAGTA CAAAAAGTGA TTTCTCATCC AAATTATGAC TCCAAGACCA 1080 

AGAACAATGA CATTGGGCTG ATGAAGCTGC AGAAGCCTCT GACTTTCAAC GACCTAGTGA 1140 

AACCAGTGTG TCIGCCCAAC CCAGGCATGA TGCTGCAGCC AGAACAGCTC TGCTGGATTT 1200 

CCGGGTGGGG GGCCACCGAG GAGAAAGGGA AGACCTCAGA AGTGCTGAAC GCTGCCAAGG 1260 

TGCTTCTCAT TGAGACACAG AGATGCAACA GCAGATATGT CTATGACAAC CTGATCACAC 1320 

CAGCCATGAT CTGTGCCGGC TTCCTGCAGG GGAACGTCGA TTCTTGCCAG GGTGACAGTG 1380 

GAGGGCCTCT GGTCACTTCG AACAACAATA TCTGGTGGCT GATAGGGGAT ACAAGCTGGG 1440 

GTTCTGGCTG TGCCAAAGCT TACAGACCAG GAGTGTACGG GAATGTGATG GTATTCACGG 1500 

ACTGGATTTA TCGACAAATG AAGGCAAACO GCTAATCCAC ATGGTCTTCG TCCTTGACGT 1560 

CGTTTTACAA GAAAACAATG GGGCTGGTTT TGCTTCCCCG TGCATGATTT ACTCTTAGAG 1620 

ATGATTCAGA GGTCACTTCA TTTTTATTAA ACAGTGAACT TGTCTGGCTT TCGCACTCTC 1680 

TGCCATACTG TGCAGGCTGC AGTGGCTCCC CTGCCCAGCC TGCTCTCCCT AACCCCTTGT 1740 

CCGCAAGGGG TGATGGCOGG CTGGTTGT G G GCACTGGCGG TCAATTGTGG AAGGAAGAGG 1800 

GTTGGAGGCT GCOCCCATTG AGATCTTCCT GCTGAGTCCT TTCCAGGGGC CAATTTTGGA 1860 

TGAGCATGGA GCTGTCACTT CTCAGCTGCT GGATGACTTG AGAIGAAAAA GGAGAGACAT 1920 

GGAAAGGGAG ACAGCCAGGT GGCACCTGCA GCGGCTCCCC TCTGGGGCCA CTTGGTAGTG 1980 

TCCCCAGCCT ACTTCACAAG GGGATTTTGC TGATGGGTTC TTAGAGCCTT AGCAGCCCTG 2040 

GATGGTGGCC AGAAAXAAAG GGACCAGCGC TTCATGGGTG GTGACGTGGT AGTCACTTGT 2100 

AAGGGGAACA GAAACATTTT TGTTCTTATG GGGTGAGAAI ATAGACAGTG CCCTTGGTGC 2160 
GAGCGAAGCA ATTGAAAAGG AACTTGCCCT GAGCACTCCT GGTGCAGGTC TCCACCTGCA 2220

CATTGGGTGG GGCTCCTGG6 AGGGAGACTC AGCCTTCCTC CTCATCCTCC CTGACCCTGC 2280 

TCCTAGCACC CTGGAGAGTG AATGCCCCTT GGTCCCTGGC AGGGCGCCAA GTTTGGCACC 2340 

ATGTCGGCCT CTTCAGGCCT GATAGTCATT GGAAATTCAG GTCCATCGGG GAAATCAAGG 2400 

ATGCTCAGTT TAAGGTACAC TOTTTCCATC TTATGTTTCT ACACATTGAT GGTGGTGACC 2460 
CTGAGTTCAA AGCCATCTT 



SEQ ID NO:173 PEU Protein sequence:
Protein Accession #:



NP_005647.1 



HALHSGSPPA 
SNPWCTQPK 
DSSGTCINPS 
GRAACRDHGY 
CIACGVNLNS 
PLNNPWHWTA 
VKFVCLINPG 
TFAMICAGFL 
TOWIYRQHKA 



11 
I 

IGFYYENHOY 
SPSGTVCTSK 
KWCDGVSHCP 
KNNFYSSQGI 
SRQSRIVGGE 
FAQILRQSFM 
MMLQPEQLCW 
QGNVDSCQCD 
KG 



21 
I 

QPEHPYPAQP 
TKKALCITLT 
COEDENRCVR 
VDDSGSTSFM 
SALPGAWfWQ 
FYGAGYQVQK 
ISGWGATEEK 
SGGPLVTSNH 



31 
I 

TWPTWEVH 
LGTFLVGAAL 
LYGPNFILQH 
KLNTSAGNVD 
VSLHVQNVHV 
VISHPNYDSK 
GKTSBVUJAA 
HIWWLIGDTS 



41 
I 

PAQYYPSPVP 
AAGLLWKFMG 
YSSQRKSWHP 

iykklkhsda 
cggsiitpew 
tknndialmk 
kvllietqrc 
kgsgcakayr 



51 
I 

QYAFBVLTOA 
SKCSNSGIEC 
VCQDDWNENY 
CSSKAWSLR 
IVTAAHCVEK 
LQKPLTFNDL 

NSRYvynra,! 

PGVYGNVMVP 



60 
120 
180 
240 
300 
360 
420 
480 



Nucleic Acid Accession #:



AIE947S7 
Coding sequence:



SEQ ID NO:174 PBJ4 DNA SEQUENCE



130-1088 (underlined sequences correspond to start and stop codons)



CAGAGAGGCT 
GGGGTCACAC 
AGCTTCTTCA 
ATAGGCCTCC 
TACCTTATTG 
CTGCATGAGC 
ACCTCATCCA 
GATGCTTGTC 
CTGCTGGCCA 
GTACTEACGT 
CTGATGGCAC 
TCCCATTCCT 
AATGTCGTCT 
TCCTTCTCAT 
AAGGCATTTG 
ATTGGATTGT 
TTGGCCAATA 
ACAAAGGA6A 
CCCTAGGTGT 
GTTAACATTT 
ATCCTTCAAA 
GTTTTCITGC 
TTTTCATTTT 
OAGAXAAGAA 
TAAACACAGA 
ACTCCCAACC 
AAASAATTTT 
AGAGTACATT 
ATGGACCCTG 
TTAGTACCCT 
GG6GTCATAC 
GGAAGAACTG 
TTCTARAGGA 
GCAACAGAAC 
AATTACCTGT 
AGAAAGTCTG 
TGATAGGCAG 
TGAAGATAAC 
ACCATGCTTT 
ATCTGACTTA 
AXAGGTTTCA 
TACTAAAACA 
CCTGATATGG 
AATGCCTATT 
TATTGAATGT 
AAAGTGCCTA 
TTCCTTCTGT 
TTAAATTTTA 
GCTCATAAAA 



11 
I 

GTATTTCAGT 
ATTCCTTCCA 
TGATGGTGGA 
CTGGTTTAGA 
CTGTGCTAGG 
OCATGTATAT 
TGCCCAAAAT 
TGCTACAGAT 
TGGCTTTTGA 
TOOCTCGTGT 
CCCTTCCTST 
ACTGCCTACA 
AT6GCCTTAT 
ATCTGCTTAX 
GCACTTGGGT 
CCAXGGTGCA 
TCTATCTGCT 
TTCGACAGCG 
CAGTGATCAA 
T6SAAGACAG 
TATGAAACTG 
TflCATATAAT 
ACCATGCAGT 
TGSTACATCT 
ATATAATAAA 
ACATTG6ATC 
TCCTCTGGAC 
TACCTACGTT 
TTTTTCCTAT 
CATTGTAGCC 
AAGTATAAAA 
TTAAAQA6AC 
GGTATTTAAT 
TCATGGCTTT 
GTCTTGGAAG 
CATAG6GCTT 
TGAGGTTAQG 
ATTGGCCTTT 
ATTTGGGGCT 
GGCATGGGAA 
TCTTCAACAG 
TGTGATCATA 
ATTCCTATOA 
TAATACTTGT 
CATCTCTGTT 
GAACATAATA 
GCT6AACACA 
GCCATTACTT 
CCCTCCCATG 



21 
I 

GCAGCCTGCC 
TACGGTTGAG 
TCCCAATGGC 
AGAGGCTCAG 
TAACTTGACA 
ATTTCTTTGC 
GCTGGCCATC 
GTTTGCCATC 
OCGCTATGTG 
CACCAAAATT 
CTTCATCAAG 
CCAAGATGTC 
CGTCATCATC 
TCTTAAGACT 
CTCTCATGTG 
TCGCTTTAGC 
GGTTOCTCCT 
CATCCTTOGA 
ACTTCTTTTC 
TATTCAGAAA 
GTTGGGGAAT 
TAXTAATACC 
CCAAATCTAA 
AGAGAACATT 
ATGAGATAAT 
TCAQAAAAAT 
ACTAGCACTT 
AATGAAAGTT 
TTAATTTTCT 
ATGGGAAAAT 
ATTAAAAAAA 
CAACAGGGTA 
TTCTTCTCAC 
AATOCCACTA 
AAGTGATTTC 
ATAGCAAGTT 
GACCCACCAG 
TGAGTGTGAC 
TTGTGCAGTA 
TCAGGCATTT 
GATATGACAA 
TATGTGGTAA 
CATGCTTTCA 
ATTTGCTGCT 
CATCATTGAC 
GTGCTTATGC 
TAGCCAGGCA 
CCAATGTGAG 
TGCAGOCTTT 



31 
I 

AGACCTCTTC 
CCTCTACCTG 
AATGAATCCA 
TTCTGGTTGG 
ATCATCTACA 
ATGCTTTCAG 
TTCTGGTTCA 
CACTCCTTAT 
GCCATCTGTC 
GGTGTGGCTG 
CAGCTGCCCT 
ATGAAGCTGG 
TCCGCCATTG 

TGTGCTGTGT 
AAGOGGCGTG 
GTGCTCAACC 
CTTTTCCATG 
CATTCAGAGT 
AAAAATTTCC 
CTCCATTTTT 
CTGACTAGGT 
ACTGCTTCTA 
TGCCAAAGGC 
CTAGCFTAAA 
ACTGTCTTCA 
AAGGGGAAGA 
GACACACTGT 
TATCAACCCT 
TGAT0TTCAG 
AAAGACTTCA 
GTGGGTTAGA 
TCATCCAGTQ 
GCTATTGCTT 
XAGGTTCAOC 
ATTTATTTTT 
TTATGATGGG 
TCGTAGCTGG 
TGGAACAGGG 
TTGCTTCTGA 
CAGTCTTAAC 
GTTTCATTTT 
TGCCCTTTTG 
GGACTGTAAG 
TGCTCTTTGC 
TTGACACCGG 
ATTTTCCAGC 
TGGAAGTGAC 
CATGTTGACA 



41 

I 

TGGAGGAAGA 
CCTGGTGCTG 
GTGCTACATA 
CCTTCCCATT 
TTGTGCGGAC 
GCATTGACAT 
ATTCCACTAC 
CTGGCATGGA 
ACCCACTGCG 
CTGTGGTGCG 

CCTGTGATGA 
GCCTGGACTC 
TGACACGTGA 
TCATATTCTA 
ACTCTCCACT 
CAATTGTCTA 
TGGCCACACA 
CCTCTGATTC 
TTAATAAAAA 
TCAATATTAT 
TQTGGTTGGA 
CTGATGGTTT 
CTAAGCACAG 
ACTATAACTP 
AAATGACTTC 
TTGGAAGTAA 
TCTGAGAGTT 
TTAATTAGGC 
TGGGGATCAG 
TGCCCAATCT 
GATTTCCAGA 
TTGTATTTAG 
ATTGTCCTGG 
ATTATGGAAG 
AAAAGTTCCA 
AAGTATGGAA 
AAAGTGAGGG 
ACTTTGAGAC 
GGGGCTATTA 
CAAGAAACTC 
CTTTTTCAAT 
TAATGGATAT 
COCATGAGGG 
TCATCATTGA 
TTATTTTTCA 
CTTCTTTGAG 
ATGTGCAATT 
TTAAATGTGA 



51



CTGGACAAAG 
GTCACAGTTC 
CTTCATCCTA 

TGAGCACAGC 
CCTCATCTCC 
CATCCAGTTT 
ATCCACAGTG 
CCATGCCACA 
GGGGGCTGCA 
CAATATCCTT 
TATCCGGGTC 
ACTTCTCATC 
AGCCCAGGCC 
TGTACCTTTC 
GCCCGTCATC 
TGGAGTGAAG 
CGCTTCAGAG 
AGATTTTAAT 
TACAACTCAO 
TTTCTTCTTT 
GGGTTATTAC 
ACAGCATTCT 
CAAAGGAAAA 
CCTCTTCAGA 
TACAGAGAAG 
AGCCTTGAAA 
TTCACAGCAT 
AAAGATATTA 
TGAATTAAAT 
CATATGATGT 
GTCTTACATT 
GAATTTCCTG 
TCCAATTGCC 
ATTCTTATTC 
TAGGTGTTTC 
TGGCAGGTGT 
AATCTTCAGG 
CGGGAAAGCA 
CCAAGGGTTA 
AAATTACATA 
CCTCAGGTTC 
CATATTTGGA 
CACTGTTTAT 
ATCCCCCAGC 
TCAAACCTGA 
TTGGGTATTA 
TTTATACCTG 
CTTGGGAAGC 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900
960 
1020 
1080
1140 
1200 
1260 
1320 
1380 
1440 
1500
1560 
1620 
1680 
1740 
1800
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 



TATGTCTTAC ACAGAQTTAA TTAACCNGAA AGGCCTGGNA ATTTTTTGNN AANNAAACTG 3000 
TGGCCNNGAG GCCCNCAACC CTTTmODDL ATTTGGCAAN NTCCCACTTT GTANTTTGGT 3060 
AAGGAQGCCA GTTGGATAAG TGAAAAATAA AGTACTATTG TGTC 



Protein Accession #:



SEQ ID NO:175 PBJ4 PROTEIN SEQUENCE
not available, cloned at Eos



1 11 21 31 41 51 

I I I I I I

HVDPNGNBSS ATYFILIGLP GLEEAQPWLA FPLCSLYLIA VLGHUTIIYI VRTEHSLHEP 60 
MYIPLCMLSG IDILISTSSM PKMLAIPWFN STTZQFDACL LQMFAIHSLS GHESTVLLAH 120 
AFDRYVAICH PLRHATVLTI. PRVTKIGVAA WRGAALMAP LFVFIKQLPF CRSHILSE5Y 180 
CLHQDVMKLA CDDIRVNWY GLIVIISAIG LDSLLISPSY LLILKTVI/3L TRBAQAKAFG 240 
TCVSHVCAVP IPYVPPIGLS MVHRPSKRRD SPLPVILANI YLLVPPVLNP XVYGVKTKBI 300 
RQRILRLFHV ATHASEP 



Nucleic Acid Accession #:
Coding sequence: 



SEQ ID NO:176 PM72 DNA SEQUENCE
NM.004S24.1 

57-1544 (underlined sequences correspond to start and stop codons)






TCGGAGCCTC 

TGQTGGTCGC 
GCGGCGGCGG 
CGCTCTT6GO 
ACAAGCAGTG 
GGGACAACCT 
COCTCATCTT 
ACGAA6GCTG 
AGGCAGCGAG 
CCATTGGCTA 
TCAGGAAGCT 
TGAG6GCTGC 
AGTGCTCGGA 
TGGCTAACTT 
CCTTCTTCTC 
GCACATTCAC 
GGTGCTGGGA 
CCATCTTGGT 
GGCCCCCAGA 
TCCTGCTGAT 
TTAAGCCTGA 
TGGCTATCCT 
GGCGCTGGCA 
GCAGCAACGG 
CCCGCCGCTC 
CCAAGCGGCC 
GGGCGCGCCA 
GGACACTCCT 
GATGGGAGCT 
AGGCCCCCTA 
TGCTGGCTCT 
TGACCTOAGG 
OCTGAAATTT 
GACTGAAGAT 
GTGGGTTATT 
GTGGACTGGC 
CTQAAGCCTC 
TACCTGCTCT 
TTCTTATCTC 
CACCTATQTQ 
AAGCAGATCC 
GTGAAAGCAC 
TTATTTGTTT 
CCCTCCCT60 
CTGG1CACAG 
CCTCTGCCAG 
GGAAAAAAAA 



COGAOOQTOG 
TCTGCTCTCG 
GGCGGCCGGG 
CCGAGGT66G 
CTCCTCGCTG 
CCTGGAGGAG 
CACCTGCTGG 
CAAGCTCTTC 
GACCCACCTG 
TTTGGATGAG 
CGGCCTGTCC 
CCACTGCACG 
CGCTGTCTTC 
GGCCTCGGTG 
CTTCTGGCTG 
TGAGCGGAAG 
CATGGTGTGG 
CACCATCAAC 
AAACTTCATC 
TATCAGGAAG 
CCCCCTGTTT 
AGTGAAGATG 
CTACTGCTTC 
CCTGCAGGGC 
CGCCACGTGC 
CTCCAGCTTC 
CCTCCCGCCC 
GCOCCGGCCC 
AGAGAACGCA 

CGCCAATCAA 
TCTGCCCAAT 
GCAGAAAGGT 
CACCATTGCT 
GCAGCTCACT 
CTGGAGTTTT 
CCCTGGGTCA 
TGGQAAATGA 
CCAAGTCTCA 
TCTGTGCTGT 
CCAACTGTTO 
TCACCCTGCT 
GGACTCTTAC 
ACCACTTGTA 
AGTGT6GCTG 
CCTCCTCTGT 
AftGATCCCCT 
AAAA 



TGGTGGTGGT 
CTCAGGCGCC 
GCTCGCTCTC 
GTCGCGCGGC 
CAGGA6GAGT 
GCCCAGCTGG 
CCAGCCACCC 
TCCTCCATTC 
GAGOCTGGCC 
CAGCAGACCA 
CTCGCCACCC 
CGGAACTACA 
ATCAAAGACT 
GGCTGTAAGG 
CTGGTGGAGG 
TACTTCTGGG 
ACCATCGCCA 
TCCTCACTGT 
CTGTTTATTT 
AGTGACAGCA 
GGAGTACACT 
GTCTTTGAGC 
CTCAATGGTG 
GTCCTGGGCT 
AGCACGCAGG 
CAAGCC6AAG 
CTTCCCACTC 
TGGGCTCGGA 
GCCCTAGAGC 
AGGATGCAGG 
GGGCAAAAAG 
T6GA6GAAAG 
TCTGCCCGGG 
GTCAAGTTCC 
ACCCTATTCT 
TGTTTGGAGA 
GTCTGGTGGG 
0AAGGCA6CC 
GTGGCTTCAT 
GGAAGCAACA 
TAACTAGGCT 
ACACATACAG 
TGCTAACTTT 
TTATTAATGC 
AGGAGGCCTC 
CTGCCCTTCA 
CAGGACTGCA 



TCGGTGGCGG 
GGGGAGGCCG 
GGAGGCGGCT 
GTGACTATGT 
AGAATGAGAC 
CTCGGGGCCA 
AAGGCCGCAA 
CGTACCCCAT 
TGTTCTACGG 
TTCTGQTCGC 
TCCACATGCA 
TGGCCCTCTT 
CAGCCATGGT 
GCCTCTACCT 
GGTACATACT 
GGATCCATTT 
GGTGGATCAT 
GCATCATCCG 
GTCCATACTC 
ACATCATGTT 
TCGTCGTGGG 
AGGTGCAGGC 
GGAACCCGAA 
TTTCCATGCT 
TCTCCCTGGT 
GCAGCAGACG 
GGCTGCCCCC 
CTGCCTGGAG 
TGGAACTCAG 
TCTACATACT 
CAACCGGTGG 
AAGGTCACCA 
TTTGGGTTAA 
CTOTTTACGC 
GCACACCTAT 
AGGACGGTGC 
ACCAGCGAAT 
CTGTCAAGTG 
GGAATCAAGA 
CAGAGATGTG 
GATTTGAACT 
TGTGTATCGT 
CATTATCCCT 
CATCTCATGT 
CCCCAGTGGC 
ACAGGCTTGT 



CTCGCCCGCC 
TTGGTCGGCG 
GGGCGGATCT 
OGAGCTTCGT 
GCAGATGATC 
AATAGGCTGC 
GGTAGTTGTC 
TGTAAGCCGC 

TTCTGTGAAG 
CACAGCTATC 
CCTCTTCATA 
CGACAGCGGG 
CTTTTTCCAA 
GTACACCCTG 
CATCGGCTGG 
TGAGGATTAT 
AAAGGGCCCC 
AATCCTGCTT 
AAGGCTAGCC 
OGCCTTCTTT 
GTCTTTCCAG 
GGAGCTGAGG 
ATACCGGCAC 
GACCCGCGTC 
CTGACCACCA 
CCGGGGACAG 
GGCCOCCTGG 
CGTTTCTAGC 
TCATTAGACT 
TTCATCCTGA 
ATCCTCAAAC 
GCACCAACAC 
GCATTACCAC 
TTAGTTATCA 
CTTAGTGQTT 
AACCCAAGGA 
GCTAGGTCTC 
GGACTCTGTC 
GACTGCCCTC 
CACCCATGGO 
CAGATCTGTC 
AACCAGCCAG 
GAATTCCCCT 
ATCATCTGGA 
CACTCAGCTT 
GCAACAATAA 



TCACTCATGC 
GTTACGCGGC 
CGCGGCGCAG 
GCTGCGCGCT 
GAGGTGCAGC 
AGCAAGATGT 
TTGGCCTGTC 
AGCTGCACCG 
TTGGATGACA 
ACCGGCTACA 
CTGAGCCTGT 
TCCTTCATCC 
GAGTCGGACC 
TATTGTGTCA 
CTTGCCGTCT 
GGGGTACCCA 
GGTCTGCTCA 
ATCCTCACCT 
CAGAAACTGC 
AGGTCCACAC 
CCGGACAATT 
GGTTTTGTGG 
CGGAAGTGGC 
CCGTCGGGAG 
AGCCCAGGTG 
GGATCCCAGC 
AGGCCTGCCC 
TCTCTGGTCC 
AAGTGAGAGA 
CCTCCTCCAA 
CTCTGCCCCC 
AACACTGGTG 
CACGGTAGTG 
TCAGGCATTT 
GCTTTTTAAA 
CCCCACCGAA 
CTGAGGGACT 
GGACTAAGCC 
ACACCAGCCA 
CTTGTCCACC 
CTCTGACAOA 
TGATAGGAAT 
ATCCTCTTGG 
TGCCACCCCA 
TAGGAGCCTG 
CCTACCCACA 
ATGtTGGCTT 



60 
120 
180
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 



SEQ ID NO:177 PM72 Protein sequence: 
Protein Accession #:



JC2195 



1 11 21 31 41 51

I I I I I I 

MPPPPLLSLR RLGGGWSAVT RLWAAAGAR SRGGRGGSRG AGGGGRGGVA HRRRLELRAA 
RSLLGSSLQE ECDYVQMIEV QHKQCLEEAQ LEKETIGCSK MWDNLTCWPA TPRGQVWLA 



60 
120 






CFLIFKLFSS IQSRNVSRSC TDBGWTHLEP GFYPIACGLD DKAASLDEQQ TKFYOSVKTG 180 

OTIGYGLSIA TIXVATAILS LPRKLHCTBN YIHMHLPISP ILRAAAVPIK DIALFDSGES 240 

DQCSEGSVGC KAAMVFFQYC VKANPFWLLV EGLYLYTLLA VSFPSERKYF NGYILIGWGV 300 

PSTFTMVWTI ARIHPBDYGL LRCWOTMSS LWWIIKGPII. TSXLVNFILF ICIIR1LLQK 360 

LRPPD1RKSD SSPYSRLARS TLLLIPLFGV HYIHFAPFPD NFKPBVKMVF ELWGSPQGP 420 

WAILYCFLN GEVQAELRRK KREKHLQGVL GWNPKYRHPS GGSNGATCST OVSMLTRVSP 480 
QAHRSSSFQA EVSLV 



Nucleic Acid Accession #:

Coding sequence: 



SEQ ID NO:178 BFF8 DNA SEQUENCE

AL133619 

1-2070 (underlined sequences correspond to start and stop codons) 






1 11 21 31 

I I I I 

ATGAGCGGTO CGGGGGTGGC GGCTGGGACG CGGCCCCCCA 
CGOCGCCCGC GCCAGCGCCC CTCTGTCCGC GTCCAGTCCT 
CTCAGGCAGA GCGACCCGCA GAAACGGAAC CTGGACCTGG 
CAOCAGCAGC ACTCGGAGAT GCTGGCCAAG CTCCATGAGG 
GAAAACAAGG GTGAGCCGGC GCGGGGCCCT AGGCC6GCCC 
ACACTGCCGC TCCCGCAGCA CAGAAACACA GCCATCAACT 
GGGGGAACAC AGGACGGGOA C C CCCTCCAQ ACTQTCCTTG 
CCTGTATGCC AACCCAOTGG GTACAGGTTC TGGGGGAOCT 
AGCCG7GGCT GGACGATGTT ATGCAGCCAA GCACAGCACG 
GGGCCTGAGG TCATTGCAGG GCGGCAGGTG GCCACAGGGT 
CCAAGTAGAG CTGAAATGGG AAGGAACOCC TGGGACAGCC 
CCTCAGATTG CTGCTGTGGC CAGGCCCASG ATTTCCAGCC 
ATGCTGGGGG CCCAGGGGAT ATGGACACAC TCCATCCAG6 
GCAGCAACCA TGGGGACAAA GGGASGAAGC AGAGTCCTGT 
GCACTTCCCC ATCCTCACAG CGGCCCCCAC CCA6CCCAGG 
GCTCACTTOC CATTATCTTT GGGGCTGGGG CTGACATCAG 
TGQAGCCAGC CTGGGAACAT CGCAGCTGGG GCAGTGCCTA 
GACATGGA6A AGGGGOTTSA GGGAGGGCCC TTCCCTAGCC 
CTGTTCTGGG CAAAGTGTGG CCCAAGTCGG CAGCCCCAGC 
GACAGGACAC GGGAAGAGGC CATGCTTTCC CTCGGGACCT 
CCCTCCTGCT TTCCAGATGG CCCCTCAGGA AACCACCTTT 
GGCGCTCGCT GGGTCTGCAT CAACGGAGTG TGGSTAGAGC 
AGGCTGAAGG AGGGCTCCTC ACGGACACAC AGGCCAGGAG 
GGCGGTAGCG CCGACACTGT GCGCTCTCCT GCAGACAGOC 
TCTGTCAAGT CCATCTCTAA TTCAGCCAAC TCTCAAGGCA 
TCCTTCAACA AGCAAGATTC AAAAGCTGAC GTCTCCCAGA 
CCCCTACTTC ACAACAGCAA GCTGGACAAA GTTCCTGGGG 
GAGAAAGCAG AGGCCTCTAA TGCAGGAGCT GCCTGTATGG 
AGGCAGATGG GGGCGGGGGC ACACCCCCCA ATGATCCTGC 
ACCACACTTA GGCAGTGCGA AGTGCTCATC CGCGAGCTGT 
ACGCAAGAGC TGCGGCACCT CAAGTCCCTC CTGGAAGGGA 
CCGGAGGAAG CTAGCTTTCC CAGGGAGCAA GAAGCCACGC 
AAGAGCCTCT CCAAGAAATG CCTGAGCCCA CCTGTGGCGG 
CTGAAGCAGA CCCCGAAGAA CAACTTTGCC GAGAGGCAGA 
AAACGGCGCC TCCATOGCTC ACTCCTTI2A 




GCTGTGGCAA 
CCTGCAGTGC 
GCTGTTCCAT 
CCAGGGCCTC 
CGGGAGGACC 
GCAAGCGTGG 
TCTCCATGTC 
AGGGCAGGCC 
AGGCGGACCT 
TACAAGGGCA 
GGAACAGCCA 
CCCTTCCCCT 
GGAATACCAA 
GCCAGAGGCC 
ATTTCCCCAA 
AGCGTGCCAT 
AGAGGCTGCA 



CAGCCCTGCC 
GCGTCTTGCG 
AAGCTTCCAG 
CCAGCCCGGC 
GGAAGAGGAG 
GGCCAGAAAG 
GCACCAGGGC 
GCGAAAGCCC 
CCTCCTSCAG 
CCAGGCAGCC 
GGTCTCCAOC 
CCTGCOCGCA 
GGCAATGCAG 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 



SEQ ID NO:179 BFF8 Protein sequence
Protein Accession #: 



1 
I 

HSGAGVAAGT 
QQQHSEMLAK 
GGTQDGEPLQ 
GFEVIAGRjQV 
MLGAQGIWTH 
ABFPIiSLGLG 
LFHAKCGPSR 
GARWVCINGV 
SVKSISNSAN 
EKAEASNAGA 
TQELRHLKSZj 
LKQTPKNNFA 



11 
I 

RPPSSPTPGS 
LHEEIEHLKR 
TVLAHLAALA 
ATGCSPDLPP 
SIQGSLPAIW 
LTSGGHLTGG 
QPQPCSAGDA 
WVEPGGPSPA 
SCjGKARPQPG 
ACMGNSQHQG 
LEGSQRPQAA 
ERQKRLQAHQ 



21 
I 

RHSHQRPSVG 
ENKGEPARGP 
PVCQPSGYRF 
PSRAEKGRNP 
AATMGTKGGS 
HS0PGNXAAG 
DRTREEAMLS 
RLKEGSSRTH 
SPNKQDSKAO 
■RQMGAGAHPP 
PEEASFPRDQ 
KRRLHRSVL 



T43457 



31 
I 

VQSLRPQSPQ 
RPALPPQAHS 
VrGIWTOAATS 
WDSPCPARSL 
RVLFPCHLSK 
AVPRALPSQG 
LGTCCSKCPK 
RPGGKRGRLA 
VSQKADLEEE 
KELPLPLRKP 
EATHFPKVST 



41 
I 

LRQSDPQKRH 
TLPLPQHRNT 
SRGWTMLCSQ 
PQIAAVARPR 
ALPHPDSGPH 
DMEKGVEGGP 
PSCPPDGPSG 
GGSADTVRSP 
PLLHNSKLDK 
TTLRQCEVLI 
KSLSKKCLSP 



51 
I 

LDLEKSLQFL 
AINSSTPXGS 
AQHVXjLSGSP 
ISSPMALSPH 
PAQDPGLWSQ 
PPSRCGNSSE 
NHLSRASAPL 
ADSLSHSSFQ 
VPGVQGQARK 
RELSJNTHLLQ 
PVAERAILPA 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 



SEQ ID NO:180 BCR4 DNA SEQUENCE
Nucleic Acid Accession #: NM_01231ft2

Coding sequence: 138-2405 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CTCGTGCCGA ATTCGGCACG AGACCGCGTG TTCGOGCCTG GTAGAGATTT CTCGAAGACA 
CCAGTGGGCC CGTGTGGAAC CAAACCTGCG CGCGTGGCCG GGCCGTGGGA CAACGAGGCC 



60 
120 






GCGGAGACGA AGGCCCAATG GCGAGGAAGT TATCTOTAAT CTTGATCCTQ ACCTTTGCCC 180 

TCTCTGTCAC AAATCCCCTT CATGAACTAA AAGCAGCTGC TTTCCCCCAG ACCACTGAGA 240 

AAATTAGTCC GAATTGGGAA TCTGGCATTA ATGTTGACTT GGCAATTTCC ACACGGCAAT 300 

ATCATCTACA ACAGCTTTTC TACCGCTATQ GAQAAAATAA TTCTTTQTCA GTTGAAGGGT 360 

TCAGAAAATT ACTTCAAAAT ATAGGCATAQ ATAAGATTAA AAGAATCCAT ATACACCATG 420 

ACCACGAOCA TCACTCAGAC CACGAGCATC ACTCAGACCA TGAGCGTCAC TCAGACCATQ 480 

AGCATCACTC AGACCACGAG CATCACTCTG ACCATGATCA TCACTCTCAC CATAATCATG 540 

CTGCTTCTGG TAAAAATAAG CGAAAAGCTC TTTGCCCAGA CCATGACTCA GATAGTTCAG 600 

GTAAAGATCC TAGAAACAGC CAGGGGAAAG GAGCTCACCG ACCAQAACAT GCCAGTGGTA 660 

GAAGGAATGT CAAGGACAGT GTTAGTGCTA GTGAAGTGAC CTCAACTQTG TACAACACTG 720 

TCTCTGAAGG AACTCACTTT CTAGAGACAA TAGAGACTOC AAGACCTGGA AAACTCTTCC 780 

OCAAAGATGT AAGCAGCTCC ACTCCACCCA GTGTCACASC AAAGAGCCGG GTGAGCCGGC 840 

TGGCTGGTAG GAAAACAAAT GAATCTGTGA GIGAGCCCCG AAAAGGCTTT ATGTATTCCA 900 

GAAACACAAA TGAAAATCCT CAGGAGTGTT TCAATGCATC AAAGCTACTG ACATCTCAIG 960 

GCATGGGCAT CCAGGTTCCG CTGAA7GCAA CAGAGTTCAA CTATCTCTGT CCAGCCATCA 1020 

TCAACCAAAT TGATGCTAGA TCTTGTCTGA TPCAIACAAG TGAAAAGAAG GCTGAAATCC 1080 

CTCCAAAGAC CTATTCATTA CAAATAGCCT GGOTT G GTGG ITTTAXAGCC ATTTCCATCA 1140 

TCAGTTTCCT GTCTCTGCTG GGGGTTATCT TACTGCCTCT CATGAATCGG GTGTTTTTCA 1200 

AATTTCTCCT GAGTTTCCTT GTGGCACTGG CCGTTGGGAC TTTGAGTGGT GATGCTTTTT 1260 

TACACCTTCT TCCACATTCT CATGCAAGTC ACCACCATAG TCATAGCCAT GAAGAACCAG 1320 

CAATGGAAAT GAAAAGAGGA CCACTTTTCA GTCATCTQTC TTCTCAAAAC ATAGAAGAAA 1380 

GTGCCTATTT TGATTCCACG TGGAAGGGTC TAACAGCTCT AGGAGGCCTG TATTTCATGT 1440 

TTCTTGTTGA ACATGTCCTC ACATTGATCA AACAATTTAA AGATAAGAAG AAAAAGAATC 1500 

AGAAGAAACC TSAAAATGAT GATGATGTGG AGATTAAGAA GCAGTTGTCC AAGIATGAAT 1560 

CTCAACTTTC AACAAATGAG GAGAAAGTAG A7ACAGATGA TCGAACTGAA GGCTATTTAC 1620 

GAGCAGACTC ACAAGAGCCC TCCCACTTTG ATTCTCAGCA GCCTGCAGTC TTGQAAGAAG 1680 

AAGAGGTCAT GATAGCTCAT GCTCATCCAC AGGAAGTCTA CAATGAATAT GTACCCAGAG 1740 

GGTGCAAGAA TAAATGCCAT TCACATTTCC ACGATACACT CGGCCAGTCA GACGATCTCA 1800 

TTCACCACCA TCATGACTAC CATCATATTC TCCATCATCA CCACCACCAA AACCACCATC 1860 

CTCACAGTCA CAGCCAGCGC TACTCTCGGG AGGAGCTGAA AGATGCCGGC GTCGCCACTT 1920 

TGGCCTGGAT GGTGATAATG GGTGATGGCC TCCACAATTT CAGCGATGGC CTAGCAATTG 1980 

GTGCTGCTTT TACTGAAGGC TTATCAAGTG CTTTAAGTAC TTCTGTTGCT GTGTTCTGTC 2040 

ATGAGTTGCC TCATGAATTA GGTGACTTTG CTGTTCTACT AAAGGCTGGC ATGACCGTTA 2100 

AGCAGGCTGT CCTTTATAAT GCATTGTCAG CCATGCTGGC GTATCTTGGA ATGGCAACAG 2160 

GAATTTTCAT TGGTCATTAT GCTGAAAATG TTTCTATGTG GATATTTGCA CTTACTGCTG 2220 

GCTTATTCAT GTATGTTGCT CTGGTTGATA TGGTACCTGA AATGCTGCAC AATGATGCTA 2280 

GTGACCATGG ATGTAGCCGC TGGGGGTATT TCTTTTTACA GAATGCTGGG ATGCTTTTGG 2340 

GTTTTGGAAT TATGTTACTT ATTTCCATAT TTGAACATAA AATCGTGTTT CGTATAAATT 2400 

TCTAGTTAAG GTTTAAAIGC TAGAGTAGCT TAAAAAGTTG TCATAGTTTC AGTAGGTCAT 2460 

AGGGAGATCA GTTTGTATGC TGTACTATGC AGCGTTTAAA GTTAGTGGGT TTTGTGATTT 2520 

TTGTATTGAA TATTGCTGTC TGTTACAAAG TCAGTTAAAG GTACGTTTTA ATATTTAAGT 2580 

TATTCTATCT TGGAGATAAA ATCTGTATGT GCAATTCACC GGTATTACCA GTTTATTATG 2640 

TAAACAAGAG ATTTGGCATG ACATGTTCTG TATGTTTCAG GGAAAAATGT CTTTAATGCT 2700 

TTTTCAAGAA CTAACACAGT TATTCCTATA CTGGATTTTA GGTCTCTGAA GAACTGCTGG 2760 

TGTTTAGGAA TAAGAATGTG CATGAAGCCT AAAATAOCAA GAAAGCTTAT ACTGAATTTA 2820 

AGCAAAGAAA TAAAGGAGAA AAGAGAAGAA TCTGAGAATT GGGGAGGCAT AGATTCTTAT 2880 

AAAAATCACA AAATTTGTTG TAAATTAGAG GGGAGAAATT TAGAATTAAG TATAAAAAGG 2940 

CAGAATTAGT AEAGACTACA TTCATTAAAC ATTTTTGTCA GGATTATTTC CCGTAAAAAC 3000 

GTAGTGAGCA CTCTCATATA CTAATTAGTG TACATTTAAC TTTGTATAAT ACAOAAATCT 3060 

AAATATATTT AATGAATTCA AGCAATATAC ACTTGACCAA GAAATTGGAA TTTCAAAATG 3120 

TTCGTGCGGG TTATATACCA GATGAGTACA GTGAGTAGTT TATGTATCAC CAGACTGGGT 3180 

TATTGCCAAG TTATATATCA CCAAAAGCTG TATGACTGGA TGTTCTGOTT ACCTGGTTTA 3240 

CAAAATTATC AGAGTAGTAA AACTTTGATA TATATGAGGA TATTAAAACT ACACTAAGTA 3300 

TCATTTGATT CGATTCAGAA AGTACTTTGA TATCTCTCAG TGCTTCAGTG CTATCATTGT 3360 

GAGCAATTGT CTTTATATAC GGTACTGTAG CCATACTAGG CCTGTCTGTG GCATTCTCTA 3420 
GATGTTTCTT TTTTACACAA TAAATTCCTT ATATCAGCTT G 



SEQ ID NO:181 BCR4 PROTEIN SEQUENCE
Protein Accession #: NP_036451



1 11 21 31 41 51 

I I I I I I 

MARKLSVILI LTFALSVTNP LHELKAAAFP QTTEKISPNW ESGmVDLAI STRQYHLQQL 60 

PYHYGEKNSL SVEGFRKLLQ NICIDKIKRI HIHHDHDHHS DHEHHSDHER HSDHEHHSDH 120 

EHHSDHDHHS HHNHAASGKN KRKALCPDHD SDSSGKDPFN SQGKGAHRPE HASGRBNVKO 180 

SVSASEVTST VYNTVSEGTH FLETIETPRP GKLFPKDVSS STPPSVTSKS RVSRLAGRKT 240 

NESVSBPRKG FKYSRHTHEN PQECFNASKt, LTSBGMGIQV PLHATEPNYL CPAIINQIDA 300 

RSCLIHTSEK KAEIPPKTYS LQIAWVGGFI AISIISFLSL LGVILVPLBN RVFFKPLLSF 360 

LVALAVGTLS GDAPLHLLPH SHASHHHSHS HEEPAHEMKR GPLPSHLSSQ UIEESAYFDS 420 

TWKGLTALGG LYFMFLVEHV LTLIKQFKDK KKKNQKKPEN DDDVEIKKQL SKYESQLSTN 480 

EEKVDTDDRT EGYLRADSQE PSHFDSQGPA VLEEEEVMIA HAHPQEVYNE YVPRGCKNKC 540 

ESHFHDTLGQ SDOLIHHKHD YHHILHHHHH QNHHPHSHSQ RYSREELKDA GVATLAWMVI 600 

MGDGLHHFSD GLAIGAAPTE GLSSGLSTSV AVFCHELPHE LGDFAVLLKA GMTUKQAVLY 660 

NALSAKLAYL GHATGZPIGH YAENVSMWIF ALTAGLFKW ALVDMVPEML HNDASDBGCS 720 
HWGYPFLQNA GHLLGFGtHL LISIPEHKIV FRJNP 



SEQ ID NO:182 BCY2 DNA sequence
Nucleic Acid Accession #: NM_001203

Coding sequence: 274-1782 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I 

CGOGGGGCGC GGAGTCGGCG GGGCCTCGCG GQACGCOOOC AGTGCGGAGA CCGCGGCGCT 60 
GAGOACGCGC GAGCCGGGAG CGCAOGCGCG GGGTGGAGTT CAGCCTACTC TTTCTTAGAT 120 
GTGAAAGGAA AGGAAGATCA TTTCATGCCT TGTTGATAAA CGTTCSGACT TCTGCTGATT 180 
CATAACCATT TGGCTCTGAG CTATGACAAG AGAGGAAACA AAAAOTTAAA CTTACAAGOC 240 
TGCCATAAGT GAGAAGCAAA CI ICC1 1UAT AACATGCTTT TGCGAAGTQC AGGAAAATTA 300 
AATGTGGGCA CCAAGAAAGA GGATGGTGAG AGTACAGCCC CCACCCCCCG TCCAAAGGTC 360 
TTGCGTTGTA AATGCCACCA CCATTGTCCA GAAGACTCAG TCAACAATAT TTGCAGCACA 420 
GACGGATATT GTTTCACG AT GATAG AAG AG GATGACICIG GGTTGCCTGT GGTCACTTCT 480 
GGTTGCCTAQ QACTAGAAGG CTCAGATTTT CAGTGTCGGQ ACACTCCCAT TCCTCATCAA 540 
AGAAG ATCAA TTG AATGCTG CACAGAAAGG AACGAATGTA ATAAAGACCT ACACOCTACA 600 
CTGGCTOCAT TGAAAAACAG AG ATTTTGTT GATGGACCTA TACAOCACAG GGCTTTACTT 660 
ATATCTQTG A CTGTCTGTAG TTTGCTCTTG GTCCTTATCA TATTATnTO TTACTTCOGG 720 
TATAAAAGAC AAGAAACCAG AGCTOGATAC AGCATTGGGTTAGAACAGGA TGAAACTTAC 780 
ATTCCTCCTG GAGAATCCCT G AG AGACTTA ATTOAGCAGT CTCAGAGCTC AGGAAGTOGA 840 
TCAGGCCIGC CTCTGCTGGT CCAAAGGACT ATAGCTAAGC AGATTCAGAT GGTGAAACAG 900 
ATTGGAAAAG GTCGCTATGQ GGAAGTTTGG ATGGSAAAGT GSCGTGGCGA AAAGGTAGCT 960 
GTG AAAGTGT TCTTCACCAC AG AGGAAGCC AGCTGGTTCA GAGAGACAGA AATATATCAG 1020 
ACAGTGTTGA TGAGGCATGA AAACATTTTG GGTTTCATTG CTGCAGATAT CAAAGGGACA 1080 
GGGTCCTGGA COCAGTTGTA CCTAATCACA GACTATCATG AAAATGGTTC CCTITATGAT 1140 
TATCTGAAGT OCACCACCCT AGAOGCTAAA TCAATGCTG A AGTTAGOCTA CTCTTCTGTC 1200 
AGTGGCTTAT GTCATTTACA CACAGAAATC TTTAGTACTC AAGGCAAACC AGCAATTGCC 1260 
CATOGAGATC TGAAAAGTAA AAACATTCTG GTG AAGAAAA ATGGAACTTQ CTGTATTGCT 1320 
GACCTGGGCC TGGCTGTTAA ATTTATTAGT GATACAAATG AAGTTGACAT ACCACCTAAC 1380 
ACTCGAGTTQ GCAOCAAAOG CTATATGCCT CCAGAAGTGT TGGAOGAGAG CTTGAACAGA 1440 
AATCACTTCC AGTCTTACAT CATGGCTG AC ATGTATAGTT TTGGCCTCAT CCTTTGGG AG 1500 
GTTGCTAGGA GATGTGTATC AGGAGGTATA GTGGAAGAAT ACCAGCTTCC TTATCATGAC 1560 
CTAGTGCCCA GTGACCCCTC TTATCAGGAC ATGAGGGAGA TTOTGTGCAT CAAGAAGTTA 1620 
CGCCCCTCAT TCCCAAACCG GTGGAGCAGT GATGAGTGTC TAAGGCAGAT GGGAAAACTC 1680 
ATGACAGAAT GCTOGGCTCA CAATCCTGCA TCAAGGCTGA CAGCCCTGCG GQTTAAGAAA 1740 
ACACTTGCCA AAATGTCAGA GTCCCAGGAC ATTAAACT CT GA TAGGAGAG GAAAAGTAAG 1800 
CATCTCTGCA GAAAGCCAAC AOOTACTCTT CT U 1 ITGTGG GCAGAGCAAA AGACATCAAA 1860 
TAAGCATCCA CAGTACAAGC CTTGAACATC GTCC1GCT1C CCAGTGGGTT CAOACCTCAC 1920 
CTTTCAGGGA GCGACCTGGG CAAAGACAG A GAAGCTCCCA GAAGGAGAGA TTGATCCGTG 1980 
TCTGTTTGTA CGCGGAGAAA CCGTTGGGTA ACTTGTTCAA GATATGATGC AT 

SEQ ID NO:183 BCY2 Protein sequence

Protein Accession #: NP_001194



1 11 21 31 41 51 
I I I I I I 

MLLRSAGKLN VGTKKEDGES TAPTPRPKVL RCKCHHHCPE DSVNNICSTD GYCFTMEED 60 
DSGLPWTSG CLGLEGSDFQ CRDTPIPHQR RSIECCTERN ECNKDLHFTL PPLKNRDFVD 120 
GPIHHRAIXI SVTVCSLLLV LHLFCYFRY KRQETRPRYS IGLEQDETYI FPGESLRDU 180 
EQSQSSGSGS GLPLLVQRTI AKQIQMVKQI GKGRYGEVWM GKWRGEKVAV KVFFTTEEAS 240 
WFRETEIYQT VLMRHENHjG F1AADKGTG SWTQLYLTTD YHENGSLYDY LKSTTLDAKS 300 
MLKLAYSS VS GLCHLHTEIF STQGKPAIAH RDLKSKNILV KKNGTCOAD LGLAVKHSD 360 
TNEVDIPPNT RVGTKRYMPP EVLDESLNRN HFQS YIMADM YSPGLILWEV ARRCVSGGIV 420 
EEYQLPYHDL VPSDPSYEDM REIVCDCKLR PSFPNRWSSD ECLRQMGKLM TECWAHNPAS 480 
RLTALRVKKT LAKMSESQDI KL 



SEQ ID NO:184 C8F9 DNA sequence

Nucleic Acid Accession #: AC005383

Coding Sequence: 328-2751 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

I I I I I I 

GACAGTGTTC GCGGCTGCAC CGCTCGGAGG CTG6GTGACC CGCGT AGAAG TGAAGTACTT 60 

TTTTATCTGC AGACCTGCGC CGATGCCGCT TTAAAAAACG CGAGOGGCTC TATGCACCTC 120 

CCTGGCGGTA CTTCCTCOGA CCTCAGCCGG GTCGGGTCOT GCCGCCCTCT CCCAGGAGAG 180 

ACAAACAGGT GTCCCACGT6 GCAGCCGCGC CCCGGGCGCC CCTCCT6TGA TCCCGTAGCG 240 

CCCCCTGGCC CGAGCCGCGC CCGGGTCTGT GAGTAGAGCC GCCCGGGCAC CGAGCGCTGG 300 

TCGCCGCTCT CCTTCCGTTA TATCAACATG CCCCCTTTCC TGTTGCTGGA GGCCGTCTGT 360 

CTTTTCCTGT TTTCCAGAGT GCCCCCATCT CTCCCTCTCC AGGAAGTCCA TGTAAGCAAA 420 

GAAACCATCG GGAAGATTTC AGCTGCCAGC AAAATGATGT GGTGCTCCGC TGCROTGGAC 480 

ATCATGTTTC TGTTAGATGG GTCTAACAGC 'GTCGGGAAAG GGAGCTCTGA AAGGTCCAftG 540 

CACTTTGCCA TCACAGTCTG TGACGGTCTG GACATCAGCC CCGAGAGGGT CAGAGTGGGA 600 

GCATICCAGT TCAGTTCCAC TCCTCATCTG GAATTCCCCT TGGATTCATT TTCAACCCAA 660 






CAGGAAGTGA AGGCAAGAAT CAAGAGGAlrG GTTTTCAAAG GAGGGCGCAC GGAGACGGAA 720 

CTTGCTCTGA AATACCTTCT GCACAGAGGG TTGCCTGGAO GCAGAAATGC TTCTGTGCCC 780 

CAGATCCTCA TCATCGTCAC TGATGGGAAG TCCCAGGGGG ATGTGGCACT GCCATCCAAG 840 

CAGCTGAAGG AAAGGGGTGT CACTOTCT TT GCTGTGGGGG TCAGCl'lTCC CAGGTGGGAG 900 

GAGCTGCATQ CACTGGCCAG CGAGCCTAGA GGGCASCACG TGCTGTTGGC TGAGCAGOTG 960 

GAGGATGOCA OCAACGGCCT CTTCAGCACC CTCAGCAGCT CGGOCATCTG CTCCAGCGCC 1020 

ACGCCAGACT GCAGGGTCGA G6CTCA00CC TGTGAGCACA GGACGCTGGA GATGGTCCGG 1080 

GAGTTCGCTG GCAATGCCCC ATGCTGGAGA GGATCGCGGC GGACCCTTGC GGTGCT6GCT 1140 

GCACACTGTC CCTTCTACAG CTGGAAGAGA GTOTTCCTAA CCCACCCTGC CACCTGCTAC 1200 

AGGACCACCT GCCCAGGCCC CTGTGACTCG CAGCCCTGCC A 6AAT0 QA0G CACATGTGTT 1260 

OCAGAAQGAC TGGACGGCTA CCAGTGCCTC TGCCCGCTQG CCTTTGGAGG GCASGCTAAC 1320 

T6T6CCCTGA AGCTGAGCCT GGAATGCAGS GTCGACCTCC TCTTCCTGCT GGACAGCTCT 1380 

GCGGGCACCA CTCTG6ACGG CTTCCTGCGG 60CAAAGTCT TCGTGAAGCG QTTTGTCCGG 1440 

GCCGTGCTGA GCGAGGACTC TCGGGCCCGA GTGGGTGTGG CCACATACAG CAGGGAGCTG 1500 

CTGOTGGCGO TGCCTGTGGG GGAGTAOCAG GATGTGCCTG AOCTGGTCTG GAGCCTCGAT 1560 

GGCATTCCCT TOCGTGGTGG CCCCACCCTG ACGGGCAGTG CCTTGCGGCA GGCG GCAGAB 1620 

CGTGGCTTCG GGAGCGCCAC CAGGACAGGC CAGGACCGGC CACGTAGAGT GGTGGTTTTG 1680 

CTCACTGA6T CACACTCCGA GGATGAGGTT G06GGCCCAO CGCGTCACGC AAGGGCGOGA 1740 

GAGCTGCTCC T6CTGG6T6T AG6CASTQA0 GOCGTGCGGG CAGAGCTGGA GOAGATCACA 1800 

CGCAGCCCAA AGCATGTGAT GGTCTACTCG GATCCTCAGG ATCTGTTCAA CCAAATCCCT 1860 

GAGCTGCAGG GGAAGCTGTG CAGCCGGCAG CGGOCAGGGT GOCG GACA CA AGCCCTGGAC 1920 

CTCGTCTTCA TGTTGGACAC CTCTGCCTCA GTAGGGCCCG AGAATTTTGC TCAGATGCAG 1980 

AGCTTTQTGA GAAGCTGTGC CCTOCAGTTT GAGGTGAACC CTGACGTGAC ACAGGTCGGC 2040 

CTGGT G GTGT ATGGCAGCCA GGTGCAGACT GCCTTCGGGC TGGACACCAA ACCCACCCGG 2100 

GCTGCGATGC TGCGGGCCAT TAGOCAGGCC COCTACCTAG CKWGGTGGG CTCAGCOGGC 2160 

ACCGCCCTGC TGCACATCTA TGACAAAGTG ATGAOCGTCC AGAGGGGTGC CCGGCCTGGT 2220 

GTOCCCAAAG CTGTGGTGGT GCTCACAGGC GGGAGAGGCG CAGAGGATGC AGCCGTTCCT 2280 

GCCCAGAAGC TGAGGAACAA TGGCATCTCT GTCTTGGTCG TGGGCGTGGQ GCCTGTCCTA 2340 

AGTGAGGGTC TGCGGAGGCT TGCAGGTCCC CGGGATTCC C TCATCCACGT OGCAGCTTAC 2400 

GCCGAOCTGC GGTACCACCA GGACGTGCTC ATTGAGTGGC TGTGTGGAGA AGCCAAGCAO 2460 

CCAGTCAACC TCTGCAAACC CAGCCCGTGC ATGAATGAGG GCAGCTGCGT CCTGCAGAAT 2520 

GGGAGCTACC GCTGCAAGTG TCGGGATGGC tggqagggcc CCCACTGCGA GAACCGTGAC 2580 

TGGAGCTCTT GCTCTGTATG TGTGAGCCAG GGATGGATTC TTGAGACGCC CCTGAGGCAC 2640 

ATGGCTCCCG TGCAGGAGGG CAGCAGCCGT ACCCCTCCCA GCAACTACAG AGAAGGCCTG 2700 

GGCACTGAAA TGGTGCCTAC CTTCTGGAAT GTCTGTCCCC CAGGTCC TTA GA ATGTCTGC 2760 

TTCCCGCCGT GGCCAGGACC ACTATTCTCA CTGAGGGAGG AGGATGTCCC AACTGCAGCC 2820 

ATGCTGCTTA GAGACAAGAA AGCAGCTGAT GTCACCCACA AAOGATGTTG TTGAAAAGTT 2880 

TTGATGTQTA AGTAAATACC CACTTTCTGT ACCTGCTGTG CCTTQTTGAG GCTATGTCAT 2940 

CTGCCACCTT TCCCTTGAGG ATAAACAAGG GGTCCTGAAG ACTTAAATTT AGCGGCCTGA 3000 

CGTTCCTTTG CACACAATCA ATGCTCGCCA GAA T G TTG T T GACACAGTAA TGCCCAGCAG 3060 

AGGCCTTTAC TAGAGCATCC TTTGGAOGGC GAAGGCGACG GCCTTTCAAG ATGGAAAGCA 3120 

GCAGCTTTTC CACTTCCCCA GAGACATTCT GGATGCATTT GCATTGAGTC TGAAAGGGGG 3180 

CTTGAGGGAC GTTTGTGACT TCTT GGCGAC TGOCTTTTGT GTGTGGAAGA GACTTGGAAA 3240 

GGTCTCAGAC TGAATGTGAC CAATTAACCA GCTTGGTTGA TGATGGGGGA GGGGCTGAGT 3300 

TGTGCATGGG CCCAGGTCTG GAGGGCCACG TAAAATCGTT CTGAGTCGTG AGCAGTGTCC 3360 

ACCTTGAAGG TCTTC 

SEQ ID NO:185 CBF9 Protein sequence
Protein Accession #: none found

1 11 21 31 41 51 

I I I I I I 

HPPPLLLEAV CVFLFSRVPP SLPIiQEVHVS KETIGKISAA SKMMWCSAAV DXMFIADGSN 60 

SVGKGSFEKS KHFAITVCDG LDISPEKVRV GAFQFSSTPB IiEFPLDSPST CQEVKARIKR 120 

MVFKGGRTET ELALKYLLHR GLPGGRNASV PQILIIVTDG KSQGDVALPS KQLKERGVTV 180 

FAVGVRFPRW EBLHALASEP RGQHVLLAEQ VEDATNGLPS TLSSSAICSS ATPDCRVEAH 240 

PCEHRTLEMV REFAGHAPCW RGSRRTLAVL AAHCPFYSWK RVFLTHPATC YRTTCPGPCD 300 

SQPCQNGGTC VPEGLDGYQC LCPLAFGGEA NCALKLSLEC RVDLLPLLDS SAGTTLDGFL 360 

RAKVFVKRFV RAVLSEDSRA RVGVATYSRE LLVAVPVGEY QDVPDLVWSL DGIPFRGGPT 420 

LTGSALRQAA ERGPGSATRT GQDRPRRVW LLTESHSEDE VAGPARHARA RBLLLLGVGS 480 

EAVRAELEEI TGSPKHVMVY SDPQDLFNQI PELQGKLCSR QRPGCRTQAL DLVFMLDTSA 540 

SVGPENFAQH QSFVRSCALQ PEVNPDVTQV GLWYGSQVQ TAFGLDTKPT RAAHUtAISQ 600 

APVtGGVGSA GTALLHIYDK VHTVQRGARP GVPKAWVLT GGRGAEDAAV PAQKLRNNGI 660 

SVLWGVGPV I*SEGLRRLAG PRDSLIHVAA YADLRYHQDV LIEWLCGBAK QPVNLCKPSP 720 

CMNEGSCVLQ NGSYRCKCRO GWEGPHCENR EMSSCSVCVS QGWILETPLR HMAPVQEGSS 780 

RTPPSNYREG LGTEMVPTFW NVCAPGP 



SEQ ID NO:186 PAV1 DNA sequence

Nucleic Acid Accession #: AF272890

Coding Sequence: 87-1520 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

TGCTACCCGC GCCCGGGCTT CTGGGGTGTT CCCCAACCAC GGCCCAGCCC TGCCACACCC 60 
OCCX30COCCG GCCTCCGCAG CTCGGCAT5G GCGCGGGGGT GCTCGTCCTG GGCGCCTCCG 120 
RGCCCGGTAA CXTGTCGTCG GCCGCACCGC TCCCCGACGG CGCGGCCACC GCGGCGCGGC 180 
TGCTGGTGCC CGCGTCGCC6 CCCCCCTCGT TGCTGCCTCC CGCCAGCOAA AGCCCCGAGC 240 

CGCTGTCTCA GCAjSTGGACA GCGGGCATGG GTCTGCTGAT GGCGCTCATC GTGCTGCTCA 300 

TCGTGGCGQG CAATQTGCTG GTGATCGTGG CCATCGCCAA GACGCCGCGG CTGCAGACGC 360 

TCACCAACCT CTTCATCATG TCOCTGGCCA GCGCCGACCT GGTCATGGGG CTGCTGGTGG 420 

TGCCGTTCGG GGCCACCATC GTGGTGTGGG GCCGCTGGGA GTACGGCTCC TTCTTCTGCG 480 

AGCTGTGQAC CTCAGTGGAC GTGCTGTGCG TGACGGCCAQ CATCGAGACC CTQTOTQTCA 540 

TTGCCCTGGA CCGCTACCTC GCCATCACCT CGCCCTTCCG CTACCAGAJ3C CTGCTGACGC 600 

GCGCGCGGGC GO30GGCCTC QTGTGCACCG TGTGGGCCAT CTCGGCCCTG OTGTCCTTCC 660 

TGCCCATCCT CATGCACTGG TGGCGGGCGG AGAGCGACGA GQCGCGCCGC TGCTACAACG 720 

ACCCCAAQTG CIGCGACTTC GTCACCAACC GGGCCTACGC CATCGCCTCG TCCGTAGTCT 780 

CCTTCTftCGT GCCCCTGTGC ATCATGGCCT TCGTGTACCT GCGGGTGTTC CGOGAGGCCC 840 

AGAAGCAGGT GAAGAAGATC GACAOCTGCG AGCGCCGTTT CCTCGGCGGC CCAGCGCGGC 900 

OGCCCTOGCC CTOGOCCTOG CCCGTO3CCG OGGOCGOGOC GCCGCCCGGA OCCCCGCGCC 960 

COGCCGCCGC CGCCGCCACC GCCCCGCTGG CCAACGGGCG TGCGGGTAAG CQGCGGCCCT 1020 

CGCGCCTCGT GGCCCTACGC GAGCAGAAGG CGCTCAAGAC GCTGGGCATC ATCATGGGCG 1080 

TCTTCACGCT CTGCT6GCTG CCCTTCTTCC TGGCCAACGT GGTGAAGGCC TTCCACCGCQ 1140 

AGCTGGT G CC CQACCGCCTC TTCGTCTTCT TCAACTGGCT GGGCTACGCC AACTCG6CCT 1200 

TCAACCCCAT CATCTACTQC CGCAGCCCCG ACTTCCGCAA GGCCTTCCAG GGACTGCTCT 1260 

GCTGCGCGCG CAGGGCTGCC CGCCGGCGCC ACGCGACCCA CGGAGACCGG OCGCGCGCCT 1320 

CGGGCTOTCT GGCCCGGCCC GGACCCCCGC CATCGCCCGG GGCCGCCTCG GAGGACGACG 1380 

ACGACGATGT CGTCGGGGCC ACGCCGCCCG CGCGCCTGCT GGA6CCCTG6 GCCGGCTGCA 1440 

ACGGCGGGGC 6GCGGCGGAC AGCGACTCGA GCCTGSACGA GCCGTGCCGC CCCGGCTTCG 1500 

CCTCGGAATC CAAGGTGTAG GGCCCGGCGC GGGGCGCGGA CTCCGGQCAC GGCTTCCCAQ 1560 

GGGAACGAGG AGATCTGTGT TTACTTAAGA CCGATAGCAG GTGAACTCGA AGCCCACAAT 1620 

CCTCGTCTGA ATCATCCGAG GCAAAQAGAA AAGCCACGGA CCGTTGCACA AAAAG6AAA0 1680 
TTTGGGAAGG GATGG6AGA6 TGGCTTGCTG ATGTTCCTTG TTG 



SEQ ID NO:187 PAV1 Protein sequence
Protein Accession #: AM1 1176



1 11 21 31 41 51 

I I I I I I 

HGAGVLVLGA SEPGHLSSAA FLFOSAATAA RLLVPASPPA SLLPPASESP EPLSQQWTAG 60 

HGLLMALIVL LIVAGNVLVI VAIAKTPRLQ TLTNLFIMSL ASADLVMGLL WPPGATIW 120 

WGRWEYGSPF CELMTSVDVL CVTASIETLC VIALDRYLAI TSPFRYQSLL TRARARGLVC 180 

TVWA1SALVS FLPILMHWWR AESDEARRCY NDPKCCDPVT HHAYAIASSV VSFYVPLCIM 240 

APVYLHVFRE AQKQVKKIDS CBRRFLGGPA RPPSPSPSPV PAPAPPPGPP RPAAAAATAP 300 

LR1JGRAGKRR PSRLVALREQ KALKTLGI1K GVFTLCWLPF PLANWKAFH RBLVPDRLPV 360 

FPNWLGYANS AFNPIIYCRS pdprkapqgl LCCABRAARR EHATHGDRPH ASGCLARPGP 420 
PPSPGAASDD DDDDWGATP PARLLEPWAG CNGGAAADSD SSLDEPCRPG FASBSKV 



SEQ ID NO:188 BC02 DNA sequence 

Nucleic Acid Accession #: AJ400877

Coding sequence: 81^TO(undedlnedseg)ences(MnBspimdtostart 

1 11 21 31 41 51 
I I I I I I 

GGCGTCCGCG CACACCTCCC CGCGCCGCCG CCGCCACCGC CCGCACTCCG CCGCCTCTGC 60 
CCGCAACCGC TGAGCCATCC ATGGGGGTCG CGGGCCGCAA CCGTCCCGGG GCGGCCTGGG 120 
CGGTGCTGCT GCTGCTGCTG CTGCTGCCGC CACTGCTGCT GCTGGCGGGG GCCGTCCCGC 180 
CGGGTCGGGG CCGTGCCGCG GGGCCGCAGG AGGATGTAG A TGAGTGTGCC CAAGGGCTAG 240 
ATGACTGCCA TGCCGACGCC CTGTGTCAGA ACACACCCAC CTCCTACAAG TGCTCCTGCA 300 
AGCCTGGCTA CCAAGGGGAA GGCAGGCAGT GTGAGGACAT CGATGAATGT GGAAATGAGC 360 
TCAATGGAGG CTGTGTCCAT GACTGTTTGA ATATTCCAGG CAATTATCGT TGCACTTGTT 420 
TTGATGGCTT CATGTTGGCT CATGACGGTC ATAATTGTCT TGATCTGGAC GAGTGCCTGG 480 
AGAACAATGG CGGCTGCCAG CATACCTGTC TCAACGTCAT GGGGAGCTAT GAGTGCTGCT 540 
GCAAGGAGGG GTnTICCTG AGTGACAATC AGCACACCTG CATTCACCGC TCGGAAGAGG 600 
GCCTGAGCTG CATGAATAAG G ATCACGGCT GTAGTCACAT CTGCAAGGAG GCCCCAAGGG 660 
GCAGCGTCGC CTGTGAGTGC AGGCCTGGTT TTGAGCTGGC CAAGAACCAG AGAGACTGCA 720 
TCTTGACCTG TAACCATGGG AACGGTGGGT GCCAGCACTC CTGTGACGAT ACAGCCGATG 780 
GCCCAGAGTG CAGCTGCCAT CCACAGTACA AGATGCACAC AG ATGGGAGG AGCTGCCTTG 840 
AGCGAGAGGA CACTGTCCTG GAGGTCACAG AGAGCAACAC CACATCAGTG GTGGATGGGG 900 
ATAAACGGGT GAAACGGCGG CTGCTCATGG AAACGTGTGC TGTCAACAAT GGAGGCTGTG 960 
ACCGCACCTG TAAGGATACT TCG ACAGGTG TCCACTGCAG TTGTCCTGTT GGATTCACTC 1020 
TCCAGTTGGA TGGGAAGACA TGTAAAGATA TTGATOAGTG CCAGACCCGC AATGGAGGTT 1080 
GTGATCATTT CTGCAAAAAC ATCGTGGGCA GTTTTGACTG CGGCTGCAAG AAAGGATTTA 1140 
AATTATTAAC AGATGAGAAG TCTTGCCAAG ATGTGQATGA GTGCTCTTTG GATAGGACCT 1200 
GTGACCACAO CTGCATCAAC CACCCTGGCA CATTTGCTTG TGCTTGCAAC COAGGGTACA 1260 
CCCICTATGG CTTCAOGCAC TGTGGAGACA CCAATGAGTG CAGCATCAAC AACGGAGGCT 1320 
GTCAGCAGGT CTGTGTGAAC ACAGTGGGCA GCTATGAATG CCAGTGCCAC CCTGGGTACA 1380 
AGCTCCACTG GAATAAAAAA GACTGTGTGG AAGTGAAGGG GCTCCTGCCC ACAAGTGTGT 1440 
CACCCCGTGT GTCCCTGCAC TGCGGTAAGA GTGGTGGAGG AGACGGGTGC TTCCTCAGAT 1500 
GTCACTCTGG CATTCACCTC TCTTCAG ATG TCACCACCAT CAGGACAAGT GTAACCTTTA 1560 
AGCTAAATGA AGGCAAGTGT AGTTTGAAAA ATGCTGAGCT GTTTCCCGAG GGTCTGCGAC 1620 
CAGCACTACC AGAGAAGCAC AGCTCAGTAA AAGAGAGCTT CCGCTACGTA AACCTTACAT 1680 
GCAGCTCTGG CAAGCAAGTC CCAGGAGCCC CTGGCCGACC AAGCACCCCT AAGGAAATGT 1740 
TTATCACTGT TGAGTTTG AG CTTGAAACTA ACCAAAAGGA GGTGACAGCT TCTTGTGACC 1800 
TGAGCTGCAT CGTAAAGCGA ACCGAGAAGC GGCTCCGTAA AGCCATCCGC ACGCTCAGAA 1860
AGOCCGTCCA CAGGG AGCAG TTTCACCTCC AGCTCTCAGG CATG AACCTC GACGTGGCTA 1920 
AAAAGCCTCC CAQ AACATCT G AACGCCAGG CAG AGTCCTG TGGAGTGGGC CAGGOTCATG 1980 
CAGAAAACCA ATGTGTCAOT TGCAGGGCTO GOACCTATTA TGATGGAGCA GOAGAACGCT 2040 
CCATTTTATO TCCAAATCGA ACCTTCCAAA ATGAGGAAGG ACAAATGACT TGTGAACCAT 2100 
GCCCAAG ACC AGG AAATTCT GGGOOCCTG A AGACCOCAGA AGCTTGG AAT ATGTCTQAAT 2160 
OTGGAGGTCT GTGTCAACCT GGTGAATATT CTCCAGATGG CTTTGCACCT TGOCAGCTCT 2220 
CTGCCCTaaG CACGTTCCAQ CCTGAAGCTO OTCGAACTTC CTGCTTCCCC TGTGGAGQAG 2280 
GCCTTGOCAC CAAACATCAG GGAGCTACTT OCTTTCAGGA CTGTGAAACC AG AGTTCAAT 2340 
GTTCAGCTGG ACATTTCTAC AACACCACCA CTCACCGATG TATTCGTTGC CCAGTGGGAA 2400 
CATACCAGCC TG AATTIGG A AAAAATAATT GTGTTTCTTO OCCAGGAAAT ACTACGACTG 2460 
ACTTTQATGQ CTCCSCAAAC ATAACCCAGT GTAAAAACAG AAGATGTGGA GGGGAGCTGQ 2520 
GAGATTTCAC TGGGTACATT GAATOCCCAA ACTACCCAGG CAATTACCCA GCCAACACCG 2580 
AGTGTACGTG GAOCATCAAC CCACCCCCCA AGCGCOGCAT OCTGATCGTG GTCCCTGAGA 2640 
TCTTCCTGCC CATAGAOGAC GACTGTCGGG ACTATCTGGT GATGCGG AAA ACCTCTTCAT 2700 
CCAATTCTGT GACAACATAT GAAACCTGCC AGACCTACGA ACGCCCCATC OCCITCACCT 2760 
CCAGGTCAAA GAAGCTQTGG ATTCAGTTCA AGTCCAATGA AGGGAACAGC GCTAGAGGGT 2820 
TCCAGGTCCC ATAOGTGACA TATG ATGAGG ACTAGCAGGA ACTCATTOAA GACATAGTTC 2880 
GAGATGGCAG GCTCTATGCA TCTGAG AACC ATCAGGAAAT ACTTAAGGAT AAGAAACTTA 2940 
TCAAGGCTCT GTTTGATOTC CICGCCCATC CCCAGAACTA TTTCAAGTAC ACAGOOCAGG 3000 
AGTCCCGAGA GATGTTTOCA AOATCGTTCA TCOGATTGCT ACGTTCCAAA GTGTCCAGGT 3060 
TmGAGACC TTACAAATGA CTCAGCCCAC GTGCCACTCA ATACAAATGT TCTGCTATAG 3120 
GGTTGGTGGG ACAGAGCTGT C TTCCT TCTO CATGTCAGCA CAGTCGGGTA TTGCTGCCTC 3180 
CCGTATCAGT GACTCATTAG AGTrCAATTT TTATAG ATAA TACAG ATATT TTGGTAAATT 3240 
GAA C1 1UU1I TTTC I 1 ILCC AGCATCGTGG ATOTAGACTO AG AATGGCTT TOAGTGGCAT 3300 
CAGCTTCTCA CTGCTGTGGG CGGATGTCTT GGATAGATCA CGGGCTGGCT GAGCTGGACT 3360 
TTGGTCAGCC TAGGTGAGAC TCACCTGTCC TTCTGGGGTC TTACTCCTCC TCAAGGAGTC 3420 
TGTAGTGGAA AGGAGGOCAC AGAATAAGCT GCTTATTCTO AAACTTCAGC TTCCTCTAGC 3480 
GCGGCCCTCT CTAAGGGAGC OCTCTGCACT CGTGTGCAGG CTCTGACCAG GCAGAACAGG 3540
CAAGAGGGGA GGGAAGGAGA CCCCTGCAGG CTCCCTCCAC CCACCTTGAG ACCIGGGAGG 3600 
ACTCAGTTTC TCCACAGOCT TCTCCAGCCT GTGTGATACA AGTTTGATCC CAGGAACTTG 3660 
AGTTCTAAGC AGTGCTOGTG AAAAAAAAAA GCAGAAAGAA TTAGAAATAA ATAAAAACTA 3720 
AGCACTTCTG GAGACAT 

SEQ ID NO:189 BC02 Protein sequence

Protein Accession #: CAB92285



1 11 21 31 41 51 
I I I I I I 

MGVAGRNRPG AAWAVLLLLL LLPPLLLLAG AVPPGRGRAA GPQEDVDECA QGLDDCHADA 60 
LCQNTPTS YK CSCKPGYQGE GRQCEDIDEC GNELNGGCVH DCLNIPGNYR CTCFDGFMLA 120 
BDGHNCLDVD ECLENNGGCQ HTCVNVMGSY ECCCKEGFFL SDNQHTCIHR SEEGLSCMNK 180 
DHGCSMCKB APRGSVACEC RPGFELAKNQ RDCDLTCNHG NGGOQHSCDD TADGPECSCH 240 
PQYKMHTDGR SCLEREDTVL EVTESNTT5 V VDGDKRVKRR LLMETCAVNN GGCDRTCKDT 300 
STGVHCSCPV GFTLQLDGKT CKDIDECQTR NGGCDHFCKN IVGSFDCGCK KGFKLLTDEK 360 
SCQDVDECSL DRTCDHSON HPGTFACACN RG YTLVGFTH CGDTNECSIN NGGCQQVCVN 420 
TVGSYECQCH PGYKLHWNKK DCVEVKGLLP TSVSPRVSLH CGKSGGGDGC FLROCGIHL 480 
SSDVmRTS VTFKLNEGKC SLKNAELFPE GURPALPEKH SSVKESFRYV NLTCSSGKQV 540 
POAPGRPSTP KEMHTVEFE LETNQKEVTA SCDLSCIVKR TEKRLRKAIR TLRKAVHREQ 600 
FHLQLSGMNL DVAKKPPRTS ERQAESCGVG QGHAENQCVS CRAGTYYDGA RERCHjCPNG 660 
TPQNEEGQMT CEPCPRPGNS GALKTPEAWN MSECGGLCQP GEYS ADGFAP CQLCALGTFQ 720 
PEAGRTSCFP CGGGLATKHQ GATSFQDCET RVQCSPGHFY NTTTHRORC PVGTYQPEFG 780 
KNNCVSCPGN TTTDFDGSTN ITQCKNRRCG GELGDFTGYI ESPNYPGNYP ANTECTWtlN 840 
PPPKRRHJV VPEIFLPtED DCGDYLVMRK TSSSNSVTTY ETCQTYERPI AFTSRSKKLW 900 
IQFKSKBGNS ARGFQVPYVT YDEDYQEUE DtVRDGRLYA SENHQEILKD KKLKALFDV 960 
LAHPQNYFKY TAQESREMFP RSFIRIXRSK VSRFLRPYK 

SEQ ID NO:190 BFG1 DNA sequence

Nucleic Acid Accession #: AF007170

Coding sequence: 1-1725 (underlined sequences correspond to stop codon)

1 11 21 31 41 51 
I I I I I I 

AAGGAGGCGG CCTGCGGGAA AAGCGACCGC AGG ACTCCTG AGAGCAGCCT CCATGAGGCC 60 
CTGGACCAGT GCATG ACOGC CCTGGACCTC TTCCTCACCA ACCAGTTCTC AGAAGCACTC 120 
AGCTACCTCA AGCCCAGAAC CAAGGAAAGC ATGTACCACT CACTGACATA TGCCACCATC 180 
CTGGAGATGC AGGCCATGAT GACCTTTG AC CCTCAGG ACA TCCTGCTTGC CGGCAACATG 240 
ATGAAGGAGG CACAGATGCT GTGTCAGAGG CACCGGAGGA AGTCTTCTGT AACAGATTCC 300 
TTCAGCAGCC TGGTGAACCG CCCCACGCTG GGCCAATTCA CTGAAGAAGA AATCCACGCT 360 
GAGGTCTGCT ATGCAG AGTG CCTGCTGCAG CG AGCAGCCC TGACCTTCCT GCAGG ACG AG 420 
AACATGGTGA GCTTCATCAA AGGCGGCATC AAAGTTCGAA ACAGCTACCA GACCTACAAG 480 
GAGCTGGACA GCCTTGTTCA GTCCTCACAA TACTGCAAGG GTGAGAACCA CCCGCACTTT 540 
GAAGGAGGAG TGAAGCTTGG TGTAGCGGCC TTCAACCTGA CACTGTCCAT GCTTOCTACT 600 
AGGATCCTGA GGCTGTTGGA GTTTGTGGGG TmCAGGAA ACAAGGACTA TGGGCTGCTG 660 
CAGCTGGAGG AGGGAGCGTC AGGGCACAGC TTCCGCTCTG TGCTCTOTGT CATGCTCCTG 720 
CTGTGCTACC ACACCTTCCT CACCTTCGTG CTCGGTACTG GQAACGTCAA CATCG AGGAG 780 
GCCGAGAAGC TCTTGAAGCC CTACCTGAAC CGGTACCCTA AGGGTGCCAT CTTCCTGTIC 840 
TTTGCAGGGA GGATTGAAGT CATTAAAGGC AACATTGATG CAGCCATCCG GCGTTTCGAG 900 
CAGTGCTGTG AGGCCCAGCA GCACTGGAAO CAGTTTCACC ACATGTGCTA CTGGGAGCTG 960
ATCTGGTGCT TCACCTACAA GGGOCACTGG AAGATGTCCT ACTTCTACGC CGACCTGCTC 1020 
AGCAAGGAGA ACTCCTGGTC CAAGGCCAOC TACATTTACA TCAAGGCCGC CTAOCTCAGC 1080 
ATGTTTGGGA AGGAGOACCA CAAGCCGTTC GGGGACGACG AAGTGGAATT ATTTCG AGCT 1140 
GTGCCAGGCC TGAAGCTCAA GATTGCTGGG AAATCTCTAC CCACAGAGAA GTTTGCCATC 1200 
CGGAAGTCCC GGCGCTACTT CTCCTCCAAC CCTATCTCGC TCCCAGTGCC TGCTCTGGAA 1260 
ATGATGTACA TCTGGAACGG CTACGCCGTC ATTGGGAAGC AGCCGAAACT CACGGATGGG 1320 
ATACTTGAG A TTATCACTAA GGCTOAAGAG ATGCTGGAGA AAGGCCCAGA GAACGAGTAC 1380 
TCAGTGGATG ACGAGTGCTT GGTGAAATTG TTGAAAGGOC TGTGTCTGAA ATACCTGGGC 1440 
CGTCTCCAGG AGGCCGAGGA GAATTTTAGG AGCATCTCTG CCAATGAAAA GAAGATTAAA 1500 
TATGACCACT ACTTGATOCC AAACGCCCTG CTGGAGCTGG CCCTGCTGCT TATGGAGCAA 1560 
GACAGAAACG AAGAGGCCAT CAAACTTTTG OAATCTGCCA AGCAAAACTA CAAGAATTAC 1620 
TCCATGGAGT CAAGGACACA CmCGAATC CAGGCAGCCA CACTCCAAGC CAAGTCTTCC 1680 
CTAGAGAACA GCAGCAGATC CATGGTCTCA TCAGTGTCCT TGT/AQCITTG TGCAGCAGTT 1740 
CCGGGCTGGA AGACAGAGAC AGCTGGACAG AGCTOCTGAA AACATTTCAA AATACCCCCT 1800 
CCCOCTGCCC TGCCCTGCCT TTGGGGTCCA CCGGCACTCC AGTTGGATGa CACAACATAG 1860 
TGTATCCGTC CAGAAGCCGA GCTGGCATTT TCACCAGTGT AGCCAAGGGC CTTTGCCAAG 1920 
GGCAGAGCAG GTGGAGCCCT CTGCCTGCXX TATCACACAT ACGGGTACTT GCTI 1 1LACT 1980 
GTOATGTTTA AGAGAATGTA TGAACAGTTT ACATTTTCCT TAGAAATACA TTGATGGGAT 2040 
CACAGTTGGC TTTAAAAAOC AACAACAATC AACCACCTGT AAGTCTTTGT CTTCACCTAT 2100
TATCATCTGG AGGTAAATCT CTTTATATGA TGATGCCAAA GGGCAAATTG OTIICAAAT 2160 
TCAGCAAGTT CTCAGCTTGT GTGACGG AAG GTCCTTCAGA GGACCTG AGG AATGCCTGGQ 2220 
AQAGGCTAAG CCTCAGGCTT CAATGCTTCT GGGGTTGGGC ATGAGG ATGT ACACAGACAC 2280 
CCACTACCTT ACTACTCACA CTTCATTTCA CTCCrnTGT AAATTTCCAA TTTAAAAATC 2340 
AAGCAC GTCT TTTTAGTGAG ATAAAATCTC AGCTCTTCTG TAGAAAAATC AATCTCTACC 2400 
AGTAGAAAAT GCCAGGGCTT G ATGGAAGAG CTGTGTAGCC CTTTCTATGC CAAAGCCAGG 2460 
AAATTTGGGG GGCAGGAGGA G GTTC TCAGA ATOCAGTCTG T A1CUI GCT GTATGCCAAA 2520 
CTGAAACCAC TGGGAATAAT TTATGAAACA TAAAAATCTT CTGTACTTCA CTCCAAGGTA 2580 
CATTTGCTTA CTGACAGCAT TTTTGTTA AA ACTGTTATTC TTGAAAAAAA AAAAAAAAAA 2640 
AA 

SEQ ID NO:191 BFG1 Protein sequence

Protein Accession #: AAC39582



1 11 21 31 41 51 
I I I I I I 

MTALDLFLTN QFSEALSYLK PRTKESMYHS LTYATTJLEMQ AMMTFDPQDI LLAGNMMKEA 60 
QMLCQRHRRK SS VTDSFSSL VNRPTLGQFT EEEUAEVCY AECLLQRAAL TFLQDENMVS 120 
FDCGGIKVRN S YQTYKELDS LVQSSQYCKG ENHPHFEGGV KLG VG AFNLT LSMLPTRILR 180 
LLEFVGFSGN KDYGLLQLEE GASGHSFRSV LCVMLLLCYH TFLTFVLGTQ NVNIEEAEKL 240 
LKPYLNRYPK GAIFLFFAGR EVKGNIDA AIRRFEECCE AQQHWKQFHH MCYWELMWCF 300 
TYKGQWKMSY FYADLLSKEN CWSKATYIYM KAAYLSMFGK EDHKPFGDDE VELFRAVPGL 360 
KLKIAGKSLP TEKFAIRKSR RYFSSNPISL PVPALEMMYI WNGYAVIGKQ PKLTDCTLH 420 
ITKAEEMLEK GPENEYSVDD ECLVKLLKGL CLKYLGRVQE AEENFRSISA NEKKKYDHY 480 
UPNALLELA LLLMEQDRNE EAIKLLESAK QNYKNYSMES RTHFRIQAAT LQAKSSLENS 540 
SRSMVSSVSL 



SEQ ID NO:192 BF06 DNA sequence

Nucleic Acid Accession #: NM_032583

Coding sequence: 1-4044 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGACTAGGA AGAGGACATA CTGGGTGCCC AAC11TJ1 PCTG GTGGCCTCGT GAATCGTGGC 60 
ATCGACATAG GCGATGACAT GGTTTCAGGA CTTATTTATA AAACCTATAC TCTCCAAGAT 120 
GGCCCCTGGA GTCAGCAAGA GAGAAATCCT GAGGCTCCAG GGAGGGCAGC TGTCCCACXXJ 180 
TGGGGGAAGT ATGATGCTGC CTTGAGAACC ATGATTCCCT TCCGTCCCAA GCCGAGGTTT 240 
CCTGCCCCCC AGCCCCTGGA CAATGCTGGC CTGTTCTCCT ACCTCACCGT GTCATGGCTC 300 
ACCCOGCTCA TGATCCAAAG CTTACGG AGT OGCTTAGATG AGAACACCAT CCCTCCACTG 360 
TCAGTCCATG ATGCCTCAG A CAAAAATGTC CAAAGGCTTC ACCGCCTTTG GGAAGAAGAA 420 
GTCTCAAGGC GAGGQATTOA AAAAOCTTCA QTGCTTCTGG TGATGCTQAO GTTCCAGAGA 480 
ACAACGTTSA TTTTCGATGC ACTTCTO3GC ATCTGCTTCT GCATTGCCAG TGTACTCGGG 540 
CCAATATTGA TTATACCAAA GATCCTGGAA TATTCAGAAG AGCAGTTGGG GAATGTTGTC 600 
CATGGAGTGG GACTCTGCTT TGCCCTTTTT CTCTCCGAAT GTGTQAAGTC TCTGAGTTTC 660 
TCCTCCAGTT GGATCATCAA CCAACGCACA GCCATCAGGT TCCGAGCAGC TGTTTCCTCC 720 
TITGCCTrra AG AAGCTCAT CCAATTTAAO TCTGTAATAC ACATCACCTC AGGAGAGGCC 780 
ATCAGCTTCT TCACCGGTGA TGTAAACTAC CTGTTTG AAG GGGTGTGCTA TGGACCCCTA 840 
GTACTGATCA CCTGCGCATC GCTGGTCATC TGCAGCATTT CTTCCTACTT CATTATTGGA 900 
TACACTGCAT TTATTGCCAT CTTATGCTAT CTCCTGGTTT TCCCACTGGC GGTATTCATG 960 
ACAAGAATGG CTGTGAAGGC TCAGCATCAC ACATCTG AGG TCAGCGACCA GCGCATCCGT 1020 
GTGACCAGTG AAGTTCTCAC TTGCATTAAG CTQATTAAAA TGTACACATG GGAGAAACCA 1080 
TTTGCAAAAA TCATTG AAGO TATGG AAAGT CTOACnTCT GCTCCAAAOC TGGTCATGGC 1140 
ATGGCCTTCA GCATGCTGGC CTCCTTGAAT CTCCTTCGGC TGTCAGTCTT CI I l U lGCCT 1200 
ATTGCAGTCA AAGGTCTCAC GAATTCCAAG TCTGCAGTGA TGAGGTTCAA GAAGTmTC 1260 
CTCCAGGAGA GCCCTGTTTT CTATGTCCAG ACATTACAAG ACCCCAGCAA AGCTCTGGTC 1320 
TTTGAGGAGG CCACCTTGTC ATGGCAACAG ACCTGTCCCG GGATCGTCAA TGGGGCACTG 1380 
GAOCTGGAGA GQAACGGGCA TGCTTCTGAG GGGATGACCA GGCCTAGAGA TGCCCTCGGG 1440 
CCAGAGGAAG AAGGG AACAG CCTGGGCCCA G AGTTGCACA AG ATCAACCT GGTGGTGTCC 1500
AAGGGGATGA TQTTAGGGGT CTGCGGCAAC AOGOGOAGTO GTAAGAGCAO CCTGTTGTCA 1560 
GCCATCCTGG AGGAOATGCA CTTOCTCGAO GGCTCGGTGG GGGTGCAGGO AAGCCIGGCC 1620 
TATGTCCCCC AGCAGGCCTQ OATCGTCAGC GGGAACATCA GGG AGAACAT CCTCATGGGA 1680 
GGOGCATATG ACAAGGCCCQ ATACCTCCAG GTGCTOCACT GCTGCTCCCT GAATCGGGAC 1740 
CTGGAACTIC TGOCCTTTGG AGACATGACA GAGATTGGAG AGCGGGGCCT CAACCTCTCT 1800 
GGGGGGCAGA AACAGAGGAT CAGCCTGGCC CGCGCCGTCT ATTCCGACCG TCAGATCTAC 1860 
CTGCTGGACG ACOCGCTGTC TOCTGTGGAC GCCCACGTGG OGAAGCACAT TTTTOAGGAO 1920 
TGCATTAAGA AGACACICAG GGGGAAGACO GTOGTCCTGO TGAOCCACCA GCTGCAGTAC 1980 
TTAGAATTTT GTGGCCAO AT CATTTTGTTG GAAAATGGGA AAATCTGTGA AAATOGAACT 2040 
CACAOTGAGT TAATGCAG AA AAAGGGG AAA TATGCCCAAC TTATCCAG AA GATGCACAAG 2100 
GAAGGCACTT CGGACATGTT GCAGGACACA GCAAAGATAG CAG AGAAGCC AAAGGTAGAA 2160 
AGTCAGGCTC TGGCCACCTC CCTGGAAGAO TCTCTCAAGG GAAATGCTGT GCCGGAGCAT 2220 
CAGCTCACAC AOGAGGAGGA GATGGAAGAA GGCTGCTTG A GTT GGAGGOT CTACCACCAC 2280 
TACATCCAGG CAGCTGGAGG TTACATGGTC TCTIGCATAA TnTCTTCTT CGTGGTGCTG 2340 
ATCGICI 1LT TAACGATCTT CAGCTTCTGG TGGCTGAGCT ACTGGTTGOA OCAGGGCTCO 2400 
GGGACCAATA GCAGCCGAGA GAGCAATGGA ACCATGGCAG ACCTGGGCAA CATTGCAGAC 2460 
AATCCTCAAC TGTCCTTCTA CCAGCTGGTG TACGGGCTCA ACGCCCTGCT OCTCATCTOT 2520 
GTGGGGGTCT GCTCCTCAGG GATTTTCACC AAAGTCACX3A GGAAGGCATC CACGGCGCTG 2580 
CACAACAAGC TCTTCAACAA GGTTTIOCGC TGGOCCATGA GTTTCTITGA CAOCATCCCA 2640 
ATAGGCCGGC TTTTOAACIXj CTTCGCAGGG G ACTTGGAAC AGCTGGAOCA GCTCTTGCCC 2700 
ATCTTTTCAG AGCAGTTCCT GGTCCTGTCC TTAATGGTGA TCGCOGTCCT GTTGATTGTC 2760 
AGTGTGCTGT CTCCATATAT CCTGITAATG GGAGCCATAA TCATOGTTAT TlGCriCATT 2820 
TATTATATGA TGTTCAAGAA GGCCATCGGT GTGTTCAAGA GACTGGAGAA CTATAGCCGG 2880 
TCTCCTTTAT TCTCGCACAT OCTCAATTCT CTGCAAGGCC TGAGCTCCAT OCATGTCTAT 2940 
GGAAAAACTG AAGACITCAT CAGCCAGTTT AAGAGGCTGA CTGATGGGCA GAATAACTAC 3000 
CTGCTJGTTGT TTCTATCTTC CACAOGATGG ATGGCATTGA GGCTGGAGAT CATG ACCAAC 3060 
CI lG'l GA CCT TGG C T GT T G C CCTG 1 1CGTG G CI 1 1 FGGCA TnCCTCCAC CCOCTACTCC 3120 
TTTAAAGTCA TGGCTG1CAA CATCGTGCTG CAGCTGGCGT CCAGCTTCCA GGCCACTGCC 3180 
GGGATTQGCT TGGAGACAGA GGCACAGTTC ACGGCTGTAG AGAGGATACT GCAGTACATG 3240 
AAGATGTGTG TCTCGGAAGC TCCTTTACAC ATGGAAGGCA CAAGTTGTCC OCAGGGGTGG 3300 
CCACAGCATG GGGAAATCAT ATTTCAGGAT TATCACATGA AATACAGAGA CAACACACCC 3360 
ACCGTGCTTC ACGGCATCAA CCTGACCATC CGCGGCCACG AAQTGGTGGG CATCGTGGGA 3420 
AGGACGGGCT CTGGGAAGTC CTCCTTGGGC ATGGCICTCT TCCGCCTGGT GGAGCCCATG 3480 
GCAGGCCGGA TICTCATTGA CGGOGTGGAC ATTTGCAGCA TCGGCCTGGA GGACTTGCGG 3540 
TCCAAGCTCT CAGTGATCCC TCAAGATCCA GTGCTGCTCT CAGG AACCAT CAGATICAAC 3600 
CTAGATCCCT TTGACCGTCA CACTGACCAG CAGATCTGGG ATGCCTTGGA GAGGACATTC 3660 
CTGACCAAGG CCATCTCAAA GTTCCCCAAA AAGCTGCATA CAGATGTGGT GGAAAACGGT 3720 
GGAAACTTCT CTGTGGGGGA GAGGCAGCTG CTCTGCATTG CCAGGGCTGT GCTTCGCAAC 3780 
TCCAAGATCA TGCTTATCGA TGAAGOCACA GCCTCCATIG ACATGGAGAC AGACACCCTG 3840 
ATCCAGCGCA CAATCCGTGA AGCCTTCCAG GGCTGCACCG TGCTCGTCAT TGCCCACCGT 3900 
GTCACCACTG TGCTGAACTG TGACCACATC CTGGTTATGG GCAATGGGAA GGTGGTAOAA 3960 
TTTGATCGGC CGGAGGTACT GCGG AAGAAG CCTGGGTCAT TCTTCGCAGC CCTCATGGCC 4020 
ACAGCCACTT CTTCACTGAG ATAAGGAGAT GTGGAGACTT CATGGAGGCT GGCAGCTQAG 4080 
CTCAGAGGTT CACACAGGTG CAGCTTCGAG GCCCACAGTC TGCGACCTTC TTGTTTCGAG 4140 
ATGAGAACTT CTCCTGGAAG CAGGGGTAAA TGTAGGGGGG GTGGGGATTG CTGGATGGAA 4200 
ACCCTGGAAT AGGCTACTTG ATGGCTCTCA AGACCTTAGA ACCCCAGAAC CATCTAAGAC 4260 
ATGGGATTCA GTGATCATGT GGTTCTCCTT TTAACTTACA TGCTGAATAA TTTTATAATA 4320 
AGGTAAAAGC TTATAGnTT CTGATCTGTG TTAGAAGTGY TGCAAATGCT GTACTGACTT 4380 
TGTAAAATAT AAAACTAAGG AAAACTCAAA AAAAAAAAAA AAAAAAA 

SEQ ID NO:193 BF06 Protein sequence

Protein Accession #: NP_115972.1

1 11 21 31 41 51
I I I I I I 

MTRKRTYWVP NSSGGLVNRG IDIGDDMVSG UYKTYTLQD GPWSQQERNP EAPGRAAVPP 60 
WGKYDAALRT MIPFRPKFRF PAPQPLDNAG LFSYL.TVSWL TPLMIQSLRS RLDENTIPPL 120 
SVHDASDKNV QRXHRLWEEE VSRRGEKAS VLLVMLRFQR TRLIFDALLG ICPCIAS VLG 180 
PUJIPKILE YSEBQLGNW HGVGLCFALF LSECVKSLSF SSSWIINQRT AKFRAAVSS 240
FAFEKUQFK SVHITSGEA ISFFTGDVNY LFEGVCYGPL VUTCASLVI CSISSYFHG 300 
YTAFIAILCY IXVFFLAVFM TRMAVKAQHH TSEVSDQRK VTSEVLTCK UKMYTWEKP 360 
FAKUEGMES LTPCSKPGDG MAFSMLASLN LLRLSVFFVP IAVKGLTNSK SAVMRFKKFF 420 
LQESPVFYVQ TUQDPSKALV FEEATLSWQQ TCPGIVNGAL ELERNGHASE GMTRPRDALG 480 
PEEEGNSLGP ELHKINLWS KGMMLG VCGN TGSGKSSLLS AILEEMHLLE GSVGVQGSLA 540 
YVPQQAWIVS GNIRENILMO GAYDKARYLQ VLHCCSLNRD LELLPFGDMT EIGERGLNLS 600 
GGQKQRELA RAVYSDRQIY LLDDPLSAVD AHVGKHIFEE CDCKTLRGKT WLVTHQLQY 660 
LEFCGQULL ENGKICENGT HSELMQKKGK YAQLIQKMHK EATSDMLQDT AMAEKPKVE 720 
SQALATSLEE SLNGNAVPEH QLPQEEEMEE GSLS WRVYHH YIQAAGGYMV SCIIFFFWL 780 
IVFLTIFSFW WLSYWLEQGS GTOSSRESNG TMADLGNIAD NPQLSFYQLV YGXNALLUC 840 
VGVCSSGIFT KVTRKASTAL HNKLFNKVFR CPMSFFDTIP IGRILNCFAG DLEQLDQLLP 900 
IFSEQR.VLS LMVIAVUJV SVLSPYILLM GAHMVICH YYMMFKKAIO VFKRLENYSR 960 
SPLFSHILNS LQGLSSIHVY GKTEDFISQF KRLTDAQNNY LLLFLSSTRW MALRLEIMTN 1020 
LVTtAVALFV AFG1SSTPYS FKVMA VNIVL QLASSFQATA RIGLETEAQF TAVERILQYM 1080 
KMCVSEAPLH MEGTSCPQGW PQHGEUFQD YHMKYRDNTP TVLHGINLTI RGHEWGIVG 1140 
RTGSGKSSLG MALFRLVEPM AGRIUDGVD ICSIGLEDLR SKLSVIPQDP VLLSGTIRFN 1200 
LDPFDRHTDQ QIWDALERTF LTKAISKFPK KLHTDVVENG GNFSVGERQL LOARAVLRN 1260 
SKmJDEAT ASIDMETOTL IQRTKEAFQ GCTVLVIAHR VTTVLNCDHI LVMGNGKWE 1320 
FDRPEVLRKK PGSLFAALMA TATSSLR 
Nucleic Acid Accession #: AA983251

Coding sequence: 1-1749 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

ATGCTGTCTG GCTTCTTGAT GAGTCCCAGT ACOCAGCACA GAGCACAGTA CACTCCCGGA 60 

GGAAAGAAAC TTCCGTGGGA GGCTTCCATC GOTGCGCACA CCTCCCGAGG GCGAGGCAGC 120 

GACCGGGAGA GGGAGAGCCG GCCGGAGGCT GCCGGGCTCC TGTGGGAC06 CGCTGCAGCC 180 

GGGGAGGCGG AGAAGGGGAA CCGGGGCGAG CCGCCCGOCT GGATCCGCGC CCAGCAGCAG 240 

CCQCGGCCGC CGCCAGCTGQ GCAQGCTCCC OQGACTGCOG CTGGGGGCGC GCAGGACCCT 300 

CGCCTGCGTC GTGGACGTTC CCGGGGGAGG GTCCGGTTGC CAGTGAAACC TCCAGAGGCT 360 

TCCGGACGAC AGCCCCGGGG GOCTTCTGAC TGCATCCCGA GATTTCCATC AGCGAGTGCA 420 

ACTCATAAGQ CAOTCCCTAA GGGGACCGGG CCACCGGCTG AGGACGGGGA TGGCTTAGGA 480 

GCTCCTGGAC CTAGGGCCCG GCGTCGTCGC CTCCTGGGCG TCGCGGCAGA GGGGAGTGOC 540 

CCGCGCGGAA AGCGCCGCGO GACAGTCAGT GACGAGGCCC GGGGGTCGCC GGGGCCACGA 600 

CTT C TC GCAG AOCGTCCTGC GCTCTCTGGA CAC GCGC TGT CCGOGOCCAO GGTGGTGCCA 660 

TGTGGGGGGC TCGCCGCTCG TCCGTCTCCT CATCCTGGAA CGCCGCTTCG CTCCTGCAGC 720 

TGCTGCTGGC TGCGCTGCTO GC6GC6G6GG CGAGGGCCCA GCGGCGAGTA CTGCCACGGC 780

TGGCTGGACG GGCAGCGGGT CTGGCGCATC GGCTTCCAGT GTCCCGAGCG CTTCGACGGC 840 

GGCGACGCCA CCATCTGCTG CGGCAGCTGC GCGTTGCGC? ACTGCTGCTC CAGCGGCGAQ 900 

GCGCGCCTGG ACCAGGGCGG CTGCGACAAT GACCGCCAGC AGG60GCTGG C6AGCCTGGC 960 

CGGGCGGACA AAGACGGGCC CCGAOGGCTC GGCAGGGCTT CATGTCTTAG GGGTACCCAA 1020 

GGAGACGGCG AGGQTGCGCC CCCACCCGTG AGGGCCTOGC AGCGGTGCTC CCCTGAAGGC 1080 

TCCCCGAAAG GAAGGCAGCT CCTCACGGCT TTCCCGGGGC TGCTGCCCCG TGCCAGAOGC 1140 

CGCGGATTCC CATCTTCTCC ACGCGGCGGC CCCTCTOOCC TGCAOCGGCC CGCCTTGCCC 1200 

ATCTACGTGC CGTTCCTCAT TGTTGGCTCC GTGTTTGTCG CCTTTATCAT CTTGGGGTCC 1260 

CTGGTGGCAG CCTGTTGCTG CAGATGTCTC CGGCCTAAGC AGGATCCCCA GCAGAGCCGA 1320 

GCCCCAGGGG GTAACCGCTT GATGGAGACC ATCCCCATGA TCCCCAGTGC CAGCACCTOC 1380 

CGGGGGTCGT CCTCACGCCA GTOCAGCACA GCTGCCAGTT CCAGCTCCAG CGCCAACTCC 1440 

GGGGCCCGG6 CGCCCCCAAC AAGGTCACAG ACCAACTGTT GCTTGCCGGA A6GGAOCATQ 1500 

AACAACOTGT ATGTCSACAT GCCCACGAAT TTCTCTQTGC TCAACTCTCA GCAGGCCACC 1560 

CAQATTGTGC CACATCAAGG GCAGTATCTQ CATCCCCCAT ACGTGGGGTA CACGGTOCAG 1620 

CACGACTCTG TGCCCA.TGAC AGCTGTGCCA CCTTTCATGG ACGGCCTGCA 6CCTGGCTAC 1680 

AGGCAGATTC AGTCCCCCTT CCCTCACACC AACAGTGAAC AGAAGATGTA CCCAGCGGTG 1740 

ACTGTATAAC C6AGAGTCAC TCGTGGGTTC CTTTACTGAA GGGAGACGAA GGCAGGGGTQ 1800 

GATTCTCQAG GTGGAAGTCC GCACATGTCC GTGOTATCTA TGGCACGATT CCTTTGGATG 1860 

GCTTCATTTG CCCCCAGACT GTATGAAAAC ATCTCCGAAT TAGCATTTCT GGATATGTTT 1920 

CATCCAG6GT ATCATTGATT TATGATGGAA AACCGGCCTC AGCTGGAGAT GACTGTGATG 1980 

TTGCTGATGG GTOTATAACA AATGCTTGAG TCCGAAGTGC CCTTGAGATA TGGTTGACGA 2040 

AAGAATTTTA TAAACTGATA AATTAAGGAT TTTTATTATG TTGTTATTAT TATTTCTTTT 2100 

TTGTTGTTGA CTGCACAGGA TCAAAATGCC TGTTATCTCC CTTTTACTGG GACTTTTTTT 2160 

■ m ' i ' lTri ' IT TTTTTTTTAA TCAGACAGGG TCTTGCTCTG TTGCCCAGGC TGGAGTGCAG 2220 

TGGTGCGATC TCCGCTCACT GCAACTTCAG CCTCCTGGAT TCAGGCAACA CTCCTGCCTC 2280 

AGCCTCCCAC GTGGCTGGGA TTACAGGTGC CTGCCCCCAT GGCTAATTTT TTGTATTTTT 2340 

TGTAGAGATG GGGTTTCACC ATGTTGGCTG GGCTGGTCTC ACTCTCCT6A CCTCAAGCAA 2400 

TCTGCCT6TC TCAGCCTCCC AAAGTGCTGG GATTACAGGC GTGAGCCACC GCCCCCAGCC 2460 

TGAGCCTTTT TTTTTTTCEA ATGCATCCAA GGTTAAGGGG AAGACGCAAA TAACAGGACT 2520 

ATTCTAAAAG GAAACCTGTT TGAACTCTCT GAGATCAGTC ATCAGTCTCA GTATTCCACA 2580 

GGCACACCTT AATTTCATTG TAAAAAGAXA TATATATTTT GTCTATTTTT GTOCTTTT6G 2640 

GGGCCTATTT TGTGCTTTTT TACCTTATGT AGAGATCTTA TTACAAAGTG ATTTTCTACA 2700 

TTAAAAACAfl ACTGAAATAA ATTGTATAGT TACTTAACTA ATQAAGACAT TTCASAACTC 2760 

TGGGATGATT TTAATCTTGA AGTAGTAGGT GGTATAGTCA TAAAACCATT CATCCCCTTC 2820 

TTGATTGTAT CTTAATTTTC TGGCTTTAAG GTGACATCTG AGAGGTAATG CATTCTTTTT 2880 

TATATTGAAA TCATAAACTA TCACCCGCTG CTTCTCTGAG TTACTTTTAA TTTTGCCTTQ 2940 

TGGTTATGGT TTGGCGTTTC CfH.TUW l 'U GTTTTCAGAG CCCCATGTCT AXAXAGTCCT 3000 

GAGTGCAAGT AATTACTATA CTTGTAAATG AAGATCAGTA TTTCTGCCTA GATCTGATAA 3060 

AAAAATTTTC TTGTCTTAGT TATAAAAATT CAAAGAAATG TGTTACAAAG ATACTTAGTA 3120 

TAGCTCCTCA GOCATAACCT GAGACTTGGG ATGAAATTTA AACCAGATAC GATTTACOTT 3180 

GCAGATCATA AGGCTTTTTA TACTCTO3TT ATCAAAATGG CTTATTTTTC AGGCACTAAG 3240 

GATTGTTAAG AGAAAASCTT TTCAACGAAG GATTGC CT T T CTTCTCCCAC ACTGTTCTTG 3300 

ATTTCCTCTC TCTTTCAGGC CTCAACAGGC ACTGTATTCA TTGCCAATGT TCCAAATTAT 3360 

CAAATTCAAG TGAATTTATT TGTGTGTTCT TTACTTATAT AAAAAAAGAT AACTTTAAGG 3420 

ATGTGCAAQT ACATTTCCAA CTGCTAGCAC AACCAGTATT TTGTAATTAA ACAAATCGCT 3480 

GTATGGTATG GTCTTCTACA CATTTATGTC TATAGATATC TATCGATCAT CTTTCTATTC 3540 

TGTTTCATGA CTGAATAATG TAAAACCAGT GTTGGCAATT GGTATCATCA ATGATACTCA 3600 

TTTTTTAATA AOCAAAGGCA GGGGAAAATC ATTTTACTTA TTAATAAATA TTTTATGATG 3660 
TGAAAAAAAA AAAAAAAAAA AAAAAAAAAA 
SEQ ID NO:195 HB8 Protein sequence

Protein Accession #: none found



1 11 21 31 41 51
I I I I I I

HLSGFLMSPS TQHRAQYTPG GKKLPWEASI GAHTSHGRGS [illegible] AGLLKDRAAA 60
[illegible] PPAWIHAQQQ PKPPPAGOAP GTAAGGAQDP RLRPGRSRGR VRLPVKSPBA 120
SGRQPRGPSD CXFRPPSASA THKAVPKCTrG PPAEDGDGLG APGPRARRRR LLGVAAEGSG 180
PRGKRRGWS DEARGSPGPR LLGDRPALSG DALSAPRWP CGALAARPSP HPGTPURSCS 240
CCWLRCWRRG RGPSGSYCHG VfLDAQGVWRI GFQCPERFDG GDATICOQSC ALRYCCSSAE 300
AELDQGGCmj DRQQGAGEPG RADKDGPRRL GRASCLRGTQ GDGE6APPPV RAWQHCSPEG 360
fiPKGRQLLHA FPGLLPRABR RGFPSSPRGG PSPLQRPALP IYVPFLIVGS VPVMIILGS 420
LVAACCCRCL RPKQDPQQSR APGGNRUJKT IPMIPSASTS RGSSSRQSST AASSSSSANS 480
GARAPPTRSQ TNCCLPEGTW NNVYVNHPTO FSVLNCQQAT QXVPHQGQYL HPPYVGYTVQ 540
HDSVHfTAVP PFKDGLQPGY RQIQSPFFHT NSBQKHYPAV TV


SEQ ID NO:196 CQA5 DNA SEQUENCE

Nucleic Acid Accession #: AA088459

Coding sequence: 882-1995 (underlined sequences correspond to start and stop codons)



1 

I 

GOCCTTGGAC 
CTGAAGAAAA 
GCGOGGGGQC 
CTGGGCCAGA 
CGGCTACTGC 
TGTGCCAGCC 
ACCTCACCCC 
CTCACCCAGG 
GCGCTCATTA 
GATTCCACCT 
AGCCCTTCGA 
GCOCAGGCAC 
GOCTGCOOOC 
ACATGGGCTG 
TGGACAGTGG 
GGTCCCATCT 
AGAGGGCGCG 
CAGGACGAGG 
GTAAGCGGGO 
CTGG0CAA66 
GGCCTGCATG 
TTGCCCCACG 
GACAGCTCCC 
CTGGGGTOCT 
GAGAGGOCAC 
GGCAGGTCCC 
GGAGTACGCA 
GAAGCAGGGG 
TCACTGTGTG 
CCCGATGCGG 
ACACTGTCCC 
CCTTCCGGAG 
TGCTGCACCT 
CMCCTCCTAC 
AGCTOCTGGG 
AGGTGGACTG 
GGCTGGGGTC 
TGGOGGATCC 
GGTGACTTCA 
GAGACAGGCT 
AAAGAAATAG 
CACGAGGGGA 
GCAGAOCCTG 
QAGCAGCGTC 
GCGTGCACAC 
CAGAAGTGTC 
TTTTGTGTTG 
CTGGAATCCC 
CCCCATCTCT 
TCATAAACAC 
TAGACCCAGA 
AGAAATAAAA 



11 
I 

ACTGACATGG 
AGGAGCTGGA 
GCGACTGGTA 
GCAGAGCCAG 
CCAAGOTACA 
GGGCCCTGCC 
CGGTCTGGCA 
AGGTGACCGA 
AGCAGCTGTT 
TCA3CTAGTC 
GGGTGGGCGC 
AGTCCCGGAG 
GGCTGGTCCC 
GGGGCTCTCT 
GGTACCCCTC 
TCAGGGAAAG 
GGGCGGCTCC 
TGGCTGTAGC 
GGTGCCTGCC 
CTGAGGGACC 
TGCCTCCCAC 
TTGAGTCCCA 
AGGCAOGTCA 
GCTCACCCCC 
CTOCCTCACC 
CTTGGGTGTC 
CTGGTGGGGG 
CACGGCAACA 
TGGGGCGCAG 
GGTCAGTGOG 
ACAAGGCACC 
COCAGCTCCA 
GGTCTGCAGG 
CCTGAAGATG 
CAGGAAAGGG 
CAGCGCAGTG 
TGCCCACCAG 
TGGCATCTTT 
TCAGGAGACC 
GGCACCTCCG 
GTCCTCCCAG 
GAATTTAAAG 
CCTGGAGCCT 
CCTGGGCTCT 
TGTGATGACA 
CCCAGTTGAG 
ATCAAGTTCC 
AGCACTTGAG 
ACAARAAAAA 
CACAAGGAAA 
TACTAGAATT 
GAGATTTCTG 



21 
I 

ACTGAAGGAG 
GCAGGAGAAG 
CCAGCAGCAG 
CGCCGACTTT 
AGAGGTGGCC 
CCCGTCCTCC 
GCAGCAGACC 
GAAGAGTGAG 
TGAGGCCCGC 
CTTGTGGGCC 
CCCATCGCAC 
TGGGCGCCTT 
CGCACCGAGC 
TGAGTCCGCA 
CATGA GTTAG 
GCACTGCCCA 
GACGCGCJGTC 
TCGGACGGAC 
TGGCTGGGGA 
CTGGCTGCAG 
AGACCCTGGG 
CACAACATCC 
TAGGCAAAGC 

CAAGGAAAAC 
ACTCCCTC A G 
GGCCCTGCTC 
GCATCGATGG 
GGCCTCCGAT 

TGTCTCAGAG 
TGCTAACCTG 
GGTGTCCCAG 
GGAGTGGGCT 
TGCAGGTCCT 
GGTGGGCCAG 
GGCCTCCCCA 
ACTGGACTGG 
GCCCACATAG 
GAAAAACTGC 
TTTACAGCTT 
GCCCCGOCTG 
GCCCTAGGAC 
ATCCGCGAGG 
CCCGGAAATG 
AATCTGCCCC 
AAGGAAAAGG 
GCCAGGAGTT 
AAAAAGAAAG 
CAATACACTA 
ATCAGAGAGA 
GAAACATGAA 




CCACCCTCTC 
CCTGCCGCCC 
GCTTGACTCC 
TAGTCCGCAG 
CGTCCCCCCG 
CGCCAGGCTG 
CAAGGGCAGC 
GGAAGTAGAT 
GCCCCAGGGA 
CGGATCGGCA 
GTGATGGOCT 
TGTGAGCCTG 
CTGTTTCOCC 
ACGCCCAGCC 
GAGAACCCCC 
CCCCTGCOCA 
AGCCCAACCT 
GTTCTGCAGC 
GCGGGGTCAG 
AGGGCCCCCT 
GAGGGGCCCT 
CCCACAGCAA 
GACAGGCGCA 
3TCCAGGGGA 
GAGGGCCTGT 
TGGCAGCCAG 
CGTCTGCCTT 
AAGCAGGAGA 
AGCTGGACCC 
CTTTCAGOCT 
GAAATCAGGC 
GCAGGGTCTA 
GCTGGGCGGG 
TGCCAGTAGC 
TCTCAGGATG 
AGAGGAACAC 
AACATCTCAG 
CCAGAGCAGC 
AAAGAAAATG 
TGAGACCCAG 
ATATAAAOTA 
AAAAAA 



41 

I 

CAOSAGGACA 
TGCAGGQTTT 
TGCAGGAGCG 
GGAGGCCCCG 
GGGAGCTGCT 
CCTGCCCTGC 
TGAAGGAGCA 
AGCTGGAGCA 
AGCAGGACGG 
CCAGGGCCAG 
TGGCTGGAGA 
TTGCCAGATG 
GTTTKGGCTC 
CTACTACTGG 
TTTCCAGCGG 
CACTTCCAAC 
TTCCOGCTCA 
GGAGGGGGTG 
TAGCGGTCGG 
CGCCGGGTGG 
TCCCCCTCTT 
GCTGCCCAGG 
CGACTCAGGA 
TGTOCCCAGG 
AGGGTACAGG 
GGCCCACTCC 
GGAGGGTCCC 
CCAGGGCCCC 
TGOGTGGGGG 
CGTGTCCAGG 
GSCAGGCAGC 
CCCCACAGAG 
AGTCAGOCCA 
CATAAGGATG 
GCCCCACAGC 
GGAGAAGCCC 
TGAGGGTGCC 
CAGAACAGTG 
GGCAGCTGAA 
TGQTGTTCCG 
TAGTGAGTGG 
GGTGGCTGGC 
TCAGTCTCCG 
GTQTGCAGGT 
TTGAAATGTG 
ACCCACACCA 
CCGGGCGTGG 
CTGGGCAACG 
AGAGATCCAG 
CAGAAGCAAC 
ACAGTGTTTT 



51 
I 

CTGACATGGA 
GGAGATGATG 
CCAGCGCCGC 
CGCACTGGGG 
GGCTGCAGCC 
CCTGACGTCC 
GAACCGACIC 
GGAGAAGTCG 
GGGACCTCTG 
CCTGGCACTC 
CCCCCGGCAG 
GGCTCCCCAG 
CTGGTTGYTG 
CCGCTGTCAG 
TGCCGCCCTG 
AACGGGCAGC 
ACCAGGGCAC 
GGGACGGCCT 
ACTTCAGGTT 
GCGAGAGCTT 
GGCCGGGACG 
AGGGCCCCCA 
TTTCCAAGGC 
TTTCAGCTGG 
AGGAGGCTGG 
CGCTGGTGCT 
AGTGTCACCA 
CGATGCGGGG 
GCGCAGGGCC 
GCACTTTGGT 
GTGGCAACTC 
CCACATTCCC 
GCATGCAGCT 
TCAGGCCTGG 
CCCAGCACCC 
CCCGTCAGC& 
TGCCATGCCC 
TCTGTCCCGG 
GCGGAAATGT 
TGCAAGGTGA 
CCCTGGAGAC 
AGAGGCACAT 
TGCAGGATGT 
ACATACACGT 
TCCTTGGGGG 
GGCCTCAGGA 
TGGTTCACGC 
CAGTGAGAGA 
GTTTAAAAAT 
AGATTGACTC 
ATATATCTAA 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
SEQ ID NO:197 JBG2 DNA SEQUENCE

Nucleic Acid Accession #: X63629

Coding sequence: 54-2543 (start and stop codons are underlined)

1 11 21 31 41 51

GCGGAACACC GGCCCGCCGT CGCGGCAGCT GCTTCAOCCC TCTCTCTGCA GCCATGGGGC 60 
TCCCTCGTGG ACCTCTCGCG TCTCTCCTCC TTCTCCAGGT TTOCTGGCTG CAGTGCGCGO 120 
CCICCOAGCC GTQCCGGGCG GTCTTCAGGO AGGCIGAAOT GACCTTGQAG GCGGGAGGOG 180
CGGAGCAGGA GCOCGGCCAG GCGCTGGGGA AAGTATTCAT GGGCTGCCCT GGGCAAOAGC 240
CAGCTCTGTT TAGCACTGAT AATGATOACT TCACIGTGCO GAATGGCGAQ ACAGTCCAGG 300 
AAAGAAGGTC ACTOAAGGAA AGO AATCCAT TO AAGATCTT OCCATCCAAA CGTATCITAC 360 
GAAQACACAA G AGAQATTGG GTGOTTGCTC CAATATCTOT OCCTOAAAAT GGCA AGGG TC 420 
CCTTCCCCCA GAGACTOAAT CAGCTCAAOT CTAATAAAQA TAGAGACACC AAGATTTTCT 480 

ACAGCATCAC GGGGCCGGGG GCAGACAGCC COOCTGAGGG TGTCTTOGCT GTAGAGAAGG 540
AGACAGGCTO U1 IGI lOllO AATAAOOCAC TGOACCGGGA GOAGATTGCC AAGTATOAGC 600 
TCTTTGGOCA CGCIGTOTCA GAGAATGOTO CCTCAGTGOA GGACCCCATG AACATCTCCA 660 
TCATCGTGAC CGACCAGAAT GACCACAAGC CCAAGTTTAC CCAGGACACC TTCOGAGGGA 720 

GTGTCTTAGA GGGAGTCCTA CCAGGTACTT CTGTGATGCA GGTG ACAGCC ACAGATGAGG 780
ATGATGCCAT CTACACCTAC AATGGGGTGG TTGCTTACTC CATCCATAGC CAAGAACCAA 840
AGGACCCACA OGAGCTCATQ TTCACAATTC ACCGGAGCAC AGGCACCATC AGCGTCATCT 900 
OCAGTGGCCT GGAOCXXK5AA AAAGTCOCTO AGTACACACT GACCATCCAG GCCACAGACA 960 
TOO ATGGGG A CGGCTOCACC ACCACGGCAG TGGCAGTAGT GG AG ATCCTT G ATGCCAATG 1020 
ACAATGCTCC CATGTTTQAC OCCCAOAAGT ACGAGGCCCA TGTGCCTGAG AATGCAGTGG 1080 

GCCATGAGGT GCAGAGGCDj AOGGTCACTG ATCTGGAOGC COGCAACTCA CCAGOGTGGC 1140
GTGCCACCTA CCTTATCATG GGCGGTGACG ACGGOGACCA TTTTACCATC ACCACCCACC 1200 
CIGAG AGCAA OCAGGGCATC CTGACAACCA GGAAGGGTTT GGATTTTGAG GCCAAAAACC 1260 
AGCACAOCCT GTAOGTTGAA GTGACCAACG AGGCCOCTTT TGTGCTGAAG CTCOCAACCT 1320 
CCACAGCCAC CATAGTGGTC CACGTGGAGG ATGTGAATGA GGCACCTGTG TTTGTCCCAC 1380 

OCICCAAAGT CGTTGAGGTC CAGGAGGGCA TCCCCACTGG GGAGCCTGTG TGTGTCTACA 1440
CTGCAGAAGA CCCTG ACAAG GAGAATCAAA AGATCAGCTA CCGCATCCTG AGAGACOCAG 1500 
CAGGGTGGCT AGCCATOOAC CCAGACAGTG GGCAGGTCAC AGCTGTGGGC ACCCTCGACC 1560 
GTGAGGATGA GCAGTTTGTG AGGAACAACA TCTATOAAGT CATGGTCTK5 GOCATGGACA 1620 
ATGGAAGCCC TCCCACCACT GGCACGGGAA CCCTTCTGCT AACACTGATT G ATGTCAACG 1680 

ACCATGOCCC AGTCCCTGAG CCCCGTCAGA TCACCATCTG CAAOCAAAGC CCTGTGCGCC 1740
AOGTGCTGAA CATCAGGGAC AAGGACCTGT CTCCOCACAC CTOGCCrTTC CAGGCCCAGC 1800 
TCACAGATGA CICAGACATC TACTGGACGG CAQAGGTCAA GGAGGAAGGT GACACAGTGG 1860 
TCTTGTCCCT GAAGAAGTTC CTGAAGCAGG ATACATATGA CGTGCACCTT TCTCTGTGTG 1920 
AOCATGGCAA CAAAGAGCAG CTGACGGTGA TCAGGGCCAC TGTGTGOGAC TGOCATGGCC 1980 

ATGTCGAAAC CTGOOCTGGA COCIGGAAAG GAGGTTTCAT CCTCCCTGTG CTGGGGGCTG 2040
TCCTGGCTCT GCTG1TOLTC CTGCrGGTGCTG LTlTlU ' lT GGTGAGAAAG AAGCGGAAGA 2100 
TCAAGGAGCC OCTCCTACTC CCAGAAGATG ACACGCGTGA CAAOGTCTTC TACTATGGCG 2160 
AAGAGGGGGG TGGCGAAGAG GAOCAGGACT ATGACATCAC CCAGCICCAC CGAGGTCTGG 2220 
AGGGCAGGCC GGAGGTGGTT CTCCGCAATG ACGTGGCAOC AACCATCATC COGACAOOCA 2280 

TGTACCGTCC TAGGCCAGCC AACCCAGATG AAATOGGCAA CTTTATAATT GAGAACCTGA 2340
AGGCGGCTAA CACAGACOCC ACAGCCCCGC CCTACGACAC CCTCTTGGTG TTCGACTATQ 2400 
AGGGCAGCGG CTCCGACGCC GCGTCCCTGA GCTCCCTCAC CTCCTCCGCC TCCGACCAAG 2460 
ACCAAGATTA CGATTATCTG AAOGAGTGGG GCAGCCGCTT CAAGAAGCTG GCAGACATGT 2520 
ACGGTGGOGG GGAGGAOGAC TAfiGOGGCCT GCCIGCAGGG CTGGGGACCA AAGGTCAGGC 2580 

CACAG AGCAT CTCCAAGGGG TCTCAGTTCC CCCTTCAGCT GAGGACTTCG GAGCTTGTCA 2640
GGAAGTGGCC GTAGCAACTT GGCGGAGACA GGCTATGAGT CTG ACGTTAG AGTGGTTGCT 2700 
TCCTTAGCCT TTCAGGATGG AGGAATGTGG GCAGTTTG AC TTCAGCACTG AAAACCTCTC 2760 
CACCTGGGCC AGGGTTGOCT CAGAGGCCAA GTTTCCAGAA GCCTCTTACC TGCCGTAAAA 2820 
TGCTCAACCC TGTGTCCIGG GCCTGGGCCT GCTOTGACTG ACCTACAOTG GACTTTCTCT 2880 

CTGGAATGGA ACCTTCTTAG GCCTCCTGGT GCAACTTAAT UUUUU ' AATGCTATCT 2940

TCAAAACGTT AGAGAAAGTT CTTCAAAAGT GCAGCCCAGA GCTGCIGGGC CCACTGGOCG 3000 
TCCIGCATTT CTGGTTTCCA GACCOCAATG CCTCCCATTC GGATGGATCT CIGCGTTTTT 3060 
ATACTG AGTG TGCCTAGOTT GOXCTTATT TTTTATTTTC OCTGTTGCGT TGCTATAGAT 3120 
GAAGGGTGAG GACAATCGTO TATATGTACT AGAACTTTTT TATTAAAGAA A 

SEQ ID NO:198 ISQ2 Protein sequence

Protein Accession #: CAA45177

1 11 21 31 41 51 
I I I I I I 

MGLFRGPLAS LUXQVCWLQ CAASEPCRAV FREAEVTLEA GGAEQEPGQA LGKVFMGCPG 60 
QEPALFSTDN DDFTVRNGET VQERRSLKER NPUCIFPSKR ILRRHKEDWV VAFISVFENG 120 

KGPFPQRLNQ LKSNKDRDTK rPYSITGPGA DSPPEGVFAV EKBTGWLLLN KPLDREHAK 180
YELFGHAVSB NO ASVEDPMN ISffVTDQND HKPKFTQDTF RGSVLEGVLP GTS VMQVTAT 240 
DEDDAIYTYN GWAYSMSQ EPKDPHDLMF TIHRSTGTIS VTSSGLDREK VPEYTLTIQA 300 
TDMDGDGSTT TAVAWHLD ANDNAFMFDP QKYEAHVPEN AVGHEVQRLT VTOLDAPNSP 360 
AWRATYUMG GDDGDHFTTT THPESNQGIL TTRKGLDFEA KNQHTLYVEV TNEAPFVLKL 420 

PTSTATIWH VEDVNEAPVF VPPSKWEVQ EGIPTGEPVC VYTAEDPDKB NQK1SYR1R 480

DPAGWLAMDP DSGQVTAVGT LCREDEQFVR NNIYEVMVLA MDNGSPPTTG TGTLLLTUD 540 
VNDHGPVPEP RQITKNOSP VRHVLNTTDK DLSPHTSPFQ AQLTDDSOIY WTAEVNEEGD 600 
TWLSLKKFL KQDTYDVHLS LSDHGNKEQL TV1RATVCDC HGHVETCPGP WKGGFU-PVL 660 
GAVLALLFLL LVLLLLVRKK RKDCEPLLLP EDDTRDNVFY YGEEGGGEED QDYDITQLHR 720 
GLEARFEWL RNDVAFTIIP TPMYRPRPAN PDEIGNFnE NLKAANTDPT APPYDTLLVF 780
DYEGSGSDAA SLSSLTSSAS DQDQDYDYLN EWGSRFKKLA DMYGGGEDD 

SEQ ID NO:199 OBQ DNA SEQUENCE

Nucleic Acid Accession #: NM_012152

Coding sequence: 43-1101 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

CTTCTTTAAA TTTCTTTCTA GGASGTTCAC TTCTTCTCCA CAATGAATGA GTGTCACTAT 60 

GACAAGCACA TGGACTTTTT TTATAATAGG AGCAACACTG ATACTGTCGA TGACTQGACA 120 

GGAACAAAGC TTGTGATTGT y i lWUi iOT T GGGACGTTTT TCTGCCTGTT TATTTTTTTT 180 

TCTAATTCTC TGGTCATCGC G6CASTCATC AAAAACAGAA AATTTCATTT CCCCTTCTAC 240 

TACCTGTTGG CTAATTTAGC TGCTGCCGAT TTCTTCGCTG QAATTGCCTA TGTATTCCTG 300 

ATGTTTAACA CAGGCCCAGT TTCAAAAACT TT6ACT0TCA ACCGCTGQTT TCTCCGTCAG 360 

GGGCTTCTGG ACAOT AGCTT GACTGCTTCC CTCACCAACT TGCTGGTTAT CGCCGXGGAG 420 

AGGCACATGT CAATCATGAX3 GATGCGGGTC CATAGCAACC TGACCAAAAA GAGGGTGACA 480 

CTGCTCATTT T GC TTGTCTQ GQCCAZOSOC ATTTTTATGQ GGGCGGTCCC CACACTGGGC 540 

TGGAATTGCC TCTGCAACAT CTCT GC CTG C TCTTCCCTGG COCOCATTTA CAGCAGGAGT 600 

TACCTTGTTF TCTGGACAGT GICCAAOCTC ATGGCCTTCC TCATCATCGT TGTGQTGTAC 660 

CTGCGGATCT ACGTGTACGT CAAGAGGAAA AOCAACGTCT TGTCTCCGCA TACAAGTGGQ 720 

TCCATCAGCC GCCGCABQAC ACCCATGAAG CTAATGAAGA CGGHGATGAC TGTCTTAGGG 780 

G CGTTTG TGG TATGCTGGAC CCCGGGCCTG GTGGTTCTGC TCC7GGACGG CCTGAACTGC 840 

AGGCAGTGTG GCGTGCAGCA TGTGAAAAGO TGGTTCCTGC TGCTGGCGCT GCTCAACTCC 900 

GTCGTGAACC CCATCATCTA CTCCTACAAG GACGAGGACA TGTATGGCAC CATGAAGAAG 960 

ATGATCTGCT GCTTCTCTCA GGAGAACCCA GAGAGGCGTC CCTCTCGCAT CCCCTOCACA 1020 

GTCCTCAGCA GGAGTGACAC AGGCAGOCAG TACATAGAGG ATAGTATTAG CCAAGGTGCA 1080 

GTCTGCAATA AAAGCACTTC CTAAACTCTO GATGCCTCTC GGCCCACCCA GGTGATGACT 1140 
GTCTTAGG 



SEQ ID NO:200 OBIS Protein sequence: 
Protein Accession #: NP_036284 

1 11 21 31 41 51 

I I I I I I 

MNECHTOKHH DFFYNRSNTD TVDDWTGTKL VIVLCVGTPF CLPIFFSNSL VIAAVIKNRK 60 

FHFPFYYLLA NLAAADFFAG IAYVFLMFKT GPV5KTLTVN RHFLRQGLLO SSLTASLTNL 120 

LVXAVEREHS IMRMRVHSHL TKKRVTLLIL LVWAIAIFHG AVPTLGWNCI. CNISACSSLA 180 

PIYSRSYLVF WnrattKAFL MVWYXRIY VYVKRKTNVL SPHTSGSISR RRTPHKIJiKT 240 

VMTVLGAFW CWTPGLWLL LDGLNCRQCG VQHVKEWPLL LALUJSWNP IIYSYKBEDM 300 
YGTMKKMICC FSQENPERRP SRIPSTVLSR SDTGSQYIED SISQGAVCNK STS 

SEQ ID NO:201 PAA6 DNA SEQUENCE 

Nucleic Acid Accession #: AA569531 

Coding sequence: 1-504 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGA CCTACA GTTACTCATT TTTCAGGCCT GAGTTGATCG TTAATCATCT TAAT TATGTT 60 

CATTCTGAAG CCAACAGGAG AAOCAAGACC AAAACTTTAT TGTCTCTGCT TTCATTTCTT 120 

GATGAAACCT CTGGACTAAG CACACATCTT OCTTGTTTAT CTCTCTCAAA GGAGTGTGGA 180 

GTGCTTCATC TGGACATCCA CGGGAAGAAG GAAGACATQA GAATCACCCA ACAGTCTTCC 240 

CAGCTATACC TGTGGGACAT GGGTGGTTTT ACAATATTTA AGAACCTGTG GATGA GCCTC 300 

AXACGCAGAG GGAACAAACG CTCCCCAAAA AGAGTXACAG AAACCATCCT GAGAGATTTT 360 

AAGCAGAAGC AAAGTTCAAA GATCCAAGAG GAGAGACGAA GAGAGTCTGC AGGACCAAAC 420 

CTCTCTTCAT TCTGGTTTGT GGGGAATGCT GGAAGAGGAG ACAGGCCCCA CJATTTGGGCA 480 

GGAAGTAAAC AGTTTTCAGG CTGAGGCCAA TCTGAGCAGG AACATTCCAA TATTTCTTCA 540 

GCTAOGTTGT CCCAGCACTT CACTGGTTAA CCTTTTATGT CCACCATTTG TGGATTTCAC 600 

AGCTACTTGT CAATGGTGAA TATTGATCAT CATCATTATC TACTGAGCTG CTACCATATG 660 

CCAGCTACTC tTTGCATGTT GTTCATTATT TTCTCAACAC TCAGCATATT TGCAATATGT 720 

TATGTAATAT CACAGACAAG GAAACTGAAC GCAGAAA1GT TTTATTTCTT GGCAAACATC 780 

ACATGAGGAT GAACAATGAA ACCGATTTGA AACCAGGATT GTCTGATTCC AACATCTCTG 840 
GGTCCTTTTT CACTCTGATA TGCTGCAATT AAAAAGCCAT TTCTAAGACT GT 



SEQ ID NO:202 PAA6 Protein sequence: 
Protein Accession #: none found 

1 11 21 31 41 51 

HTYSVSPFRP ELIVNHLNW HSEANRJtTKT KTLLSLLSFL DETSGLSTHL PCLSLSKECG 60 
VLHLDIHGKK EBMRITQQSS QLYLWK4GGP TIPKHLWMSL IFKGNKRSPK RVTETILRDP 120 
KQKQSSKIQE ERRRESAGPN LSSFWFVGNA GRGDRPQIWA GSKQFSG 
SEQ ID NO:203 PAB2 DNA SEQUENCE 

Nucleic Acid Accession #: XM_050197 

Coding sequence: 310-1971 (underlined sequences correspond to start and stop codons) 



i • 
I 

TCACACGTGC 
AGCOGCGCGC 
GCAGCAGQTO 
GGGGOCTGGC 
AGCAGAGCCG 
TGGCCCSCTA. 
CTCTTQCTG6 
T A TGTOCC G C 
GGCATTGGTC 
TGGCGTGGAC 
CTGAGOCTCT 
AG GC CCCTGG 
GTGTGCTTCA 
CGCCAGGCCT 
CTGCCTGCCA 
TGCCTCTTTQ 
GCTGAGGAGG 
TCGCCCCACT 

GAGCTGTGCA 
GA6GGGCT6T 
GATGAAGGCG 
TTCTCTCTGO 
AGTGTGGCAG 
GTGACAGCTT 
ACACTGGCCT 
ACTGGAGGTG 
GGAGCTCCCT 
OCCGOGCTCT 
ACCGAGGCCA 

CAGTCTCTCA 
GCIACACAGQ 
AGCACATTGG 
ATGGGGCTGC 
GCCACCCTGT 
CTCTCCOCAG 
TTATACAGGG 
ACCCAGGCTC 
GGGAGCTGAA 
CGTTTAATGT 
ACATATGAAA 
CCTCAGCCCC 
TP 



11 
I 

CAAGGGGCTG 
CTCGGGCAGG 
TTGAGCATGG 
TGATTCCTAJ3 
AGACGAAGCA 
TOGTCCAGAG 
TCAACCTGCT 
CTCTGCTGCT 
CAGTGCTGGG 
GCTATGGCCG 
TTCTCATCCC 
AGCTGGCACT 
CTCCACTGGA 
ACTCTGTCTA 
TTGACTGGGA 
GCCTGCTCAC 
CAGCGCTGGG 
GCTOTCCATG 
ACCAGCTGTG 
GCTGGATGGC 
ACCAGGGCGT 
TTCGGATGGG 
TCATGGACCG 
CTTTCGCTGT 
CAGCCGCCCT 
OOCTCTAOCA 
CTAGCAGTGA 
TCCCTAATGG 
GCGGGGCCTC 
GGGTGGTTCC 
TGTCCCAGGT 
CTGCCTATAT 
TflGTATTTGA 
GGTGGAGGGC 
CQGGCTGGCC 
GCTGCTGAGG 
TCTCTAGGGC 
AGGCCAGAAG 
AGGGTTAACA 
TAAACTCAGT 
AGCTCTTGCA 
GTTATTTGTA 
ACAGGCACTG 



21 
I 

GCTCAGCGGA 
ATCTGAGTGA 
GCTGAGAAGC 
GCAGTTGGGG 
GTTCTGGAGT 
GCTGTGGQTG 
AACCTTTGGC 
GGAAGTGGGG 

CCGCCGGCCC 
AAGGCCCGGC 
GCTCATCCTG 
G6C0CTGCTC 
TGCCTTCATG 
CACCAGTGCC 
CCTCATCTTC 
CCCCACCGAG 
CCGGGOOCGC 
CTGCGGCATG 
ACTCATGACC 
GCCCAGAGCT 
CAGCCTGGGG 
GCTGGTGCAG 
GGCTGCCGGT 
CACCGGGTTC 
CCGGGAGAAG 
GGACAGCCTG 
ACACGTGGGT 
TGCCTCTGJAT 
GGGCCGGGGC 
GGCCCCATCC 
GGTGTCTGCC 
CAAfiAGCGAC 
CTGCCTCACT 
GCCAGTTTCT 
TGCGTAGCTG 
TGCCTGACTG 
GGCTCCATGC 
GCTAGCCTCC 
CACCTGGTTT 
TGGGAGTTTC 
GGGOAACAGT 



31 
I 

ACCAGCCTGC 
TGAGACGTGT 
TGGACCGGCA 
GCAGCAADOA 
GCCTGAACGG 
AGCCQQCTGC 
CTGGAGGTGT 
GTAGAGGAGA 
GTCCCGCTCC 
TTCATCTGGG 
TGGCTAOCAO 
GGCGTGQGGC 
TCTGACCTCT 
ATCAGTCCTG 
CTGGCCCCCT 
CTCACCTCCG 
CCAGCAGAAG 
TTGGCTTTCC 
CCCCGCACCC 
TTCACGCTGT 
GAGCCGGGCA 
CTGTTCCTGC 
CGATTCGGCA 
GCCACATGCC 
ACCTTCTCAG 
CAGGTGTTCC 
AIGACCAGCT 
GCTGGAGGCA 
GTCTCCGTAC 
ATCTGCCTGG 
CTGTTTATGG 
GCAGGCCTGG 
TTGGCCAAAT 
GGGTCCCAGC 
GTTGCTGCCA 
CACAGCTGGG 
GAGGCCTTCG 
ACTGGAATGC 
TAGTTOAGAC 
CCCATCTCTA 
TAGGATGAAA 
CCTGAGGGGC 
CTNGANTCCA 



41 

I 

ACCCGCTCGC 
CCCCACTGAG 
CCAAAGGGCT 
GGAGAGGCCG 
CCOCCTGAGC 
TGCGGCACCG 
GTTTGGCOGC 
AGTTCATGAC 
TAGGCTCAGC 
CACIGTOCTT 
GGCTGCTGTG 
TGCTGGACTT 
TCCGGGACCC 
GGGGCTGCCT 
ACCTGGGCAC 
TAGCAGCCAC 
GGCTGTOGGC 
GGAACCTGGG 
TGCGCCGGCT 
TTTACACGGA 
CCOAGGCCCG 
AGTGCGCCAT 
CTCGAGCAGT 
TQTCCCACAG 
CCGTGCAGAT 
TGCCCAAATA 
TCCTGCCAGG 
GTGGCCTGCT 
GTCTGGTGGT 
ACCTCGCCAT 
GCTCCATTGT 
GTCTGOTCGC 
ACTCAGCGTA 
TCCCCGCTCC 
AAGTAATGTG 

AAGGGGGTTT 
GGGGACTCT6 
ACACCTAGAQ 
AGCCCCTTAA 
CACTCCTCCA 
AACACACAAG 
CCOCCCCCCT 



SI 
I 

TCCGGGTGAC 
GTGCCCCACA 
GGCAGAAATG 
CAGCTTCTGO 
CCTACCCGCC 
GAAAGCCCAG 
AGGCATCACC 
CATGGTGCTG 
CAGTGACCAC 
GGGCATCCTG 
CCCCGATCCC 
CTGTGGCCAG 
GGACCACTGT 
GGGCTACCTC 
CCAGGAGGAG 
ACTGCTGGTG 
CCCCTCCTTG 
CGCCCTGCTT 
CTTCGTGGCT 
TTTGGTGGGC 
GAGACACTAT 
CTCCCTGGTC 
CTATTTGGCC 

CCTGCCCTAC 
CCGAGGGGAC 
CCCTAAGCCT 
CCCACCTCCA 
GGGTGAGGCC 
CCTGGATAGT 
CCAGCTCAGC 
CATTTACTTT 
GAAAACTTCC 
TCTTAGCCCC 
GCTCTCTGCT 
TCCCTCTCCT 
CAGTCTGGAC 
CAGGTGGATT 
AAGGGTTTTT 
CCTGCAGCTT 
TGGGATTTGA 
AACCAGGTCC 
CTTTACCCTT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 



SEQ ID NO:204 PAB2 Protein sequence: 
Protein Accession #: XP_050197 

1 11 21 31 41 51 
I I I I I I 

MVQRLWVSRL LRHRKAQIXt. WLLTFGLEV CLAAGITYVP PLLLEVGVBE KFMTHVLGIG 60 
PVLGLVCVPL USSASDHWRG RY6RRRPFIW ALSLGILLSL FLIPRAGWLA GLLCPDPRPL 120 
ELALLILGVG LLDFCGQVCP TPLEALLSDL FRDPDHCRQA YSVYAFMISL GGCLGYLLPA 180 
IDWDTSALAP YLGTQEBCLF GLLTLIFLTC VAATLLVAEB AALGPTEPAE GLSAPSLSPH 240 
CCPCBARLAP RHLGALLPRL HQLCCRHFRT LRRLFVAELC SWMALMTFTL FYTDFVGEGL 300 
YQCVPRAEPG TEARRHYDEG VRHGSLGLFL QCAISXVFSL VHDRLVQRFG TRAVYLASVA 360 
AFPVAAGATC LSHSVAWTA SAALTCFTFS ALQILPVTLA SLYHREKQVP LPKYRGDTGG 420 
ASSEDSLHTS PLPGPKPGAP PPNGHVGAGG SGLLPPFPAL CGASACDVSV RWVGBPTEA 480 
RWPGRGICL DLA1LDSAPL LSQVAPSLFK GSIVQLSQSV TAYKVSAAGL GLVAIYFATQ 540 
WFDKSDLAK YSA 

SEQ ID NO:205 PAJ3 DNA SEQUENCE 

Nucleic Acid Accession #: AK002126 

Coding sequence: 1-1593 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGGTTCGOC GOG G GCTGCT TGCGTGGATT TCCCGGGTGG TGGTTTTGCT GGTGCTCCTC 60 
TGCTGTGCTA TCTCTGTCCT GTACATGTTG GCCTGCACCC CAAAAGGTGA CGAGGAGCAG 120 
CTGGCACTGC CCAGGGCCAA CAGCCCCACG GGGAAGGAGG GGTACCAGGC CGTCCTTCAG 180 
GAGTGGGAGG AGCAGCACCG CAACTACGTG AGCAGCCTGA AGCGGCAOAT CGCACAGCTC 240 
AAGGAGGAGC TGCAGGAGAG GAGTGAGCAG CTCAGGAATG GGCAGTACCA AGCCAGCGAT 300 

GCT GC TQGCC TGGOTCTGGA CAGGAGCCCC CCAOAGAAAA CCCAGGOCGA CCTCCTGGCC 360 

TTCCTGCACT CGCAGGTGGA CAAGGCAGAG GTGAATGCTG GCGTCAAGCT GGCCACAGAG 420 

TATGCAGCAG TGCCTTTCGA TAGCTTTACT CTACAGAAGG TGTACCAGCT GGAQACTGGC 480 

CTTACCCGCC ACCCCGAGOA GAAGCCTGTO AGGAAGGACA AGCGGGAXGA GTTGGTGGAA 540 

GCCATTGAAT CAGCCTTGGA GACCCTGAAC AATCCTGCAG AGAACAGCCC CAATCAOCGT 600 

CCTTACACGG CCTCTGATTT CATAOAAGGG ATCTACCGAA CAGAAAGGGA CAAAGGGACA 660 

TTGTATGAGC TCACCTTCAA AGGGGACCAC AAACACGAAT TCAAACGGCT CATCTTATTT 720 

CGACCATTCG GCCCCATCAT GAAAGTGAAA AATGAAAAGC TCAACATGGC CAACACGCTT 780 

ATCAATGTTA TCGTGCCTCT AGCAAAAAGG GTGGACAAGT TCCGGCAGTT CATGCAGAAT 840 

TTCAGGGAGA TGTGCATTGA GCAGGATGGG AGAGTCCATC TCACTGTTGT TTACTTTGGG 900 

AAAGAAGAAA TAAATGAAGT CAAAGGAAXA CTTGAAAACA CTTCCAAAGC TGCCAACTTC 960 

AGGAACTTTA CCTTCATCCA GCTGAATGGA GAATTTTCTC GGGGAAAGGG ACTTGATGTT 1020 

GGAGCCCGCT TCTGGAAGGG AAGCAACGTC CTTCTCTTTT TCTGTGATGT GGACATCTAC 1080 

TTCACATCTG AATTCCTCAA TACGTQTAGG CTGAATACAC AGCCAGGGAA GAAGGTATTT 1140 

TATCCAGTTC TOTTCAGTCA GTACAATCCT GGCATAATAT ACGGCCAOCA TGATGCAGTC 1200 

CCTCCCTTGG AACAGCAGCT GGTCATAAAG AAGOAAACTG GATTTTGGAG AGACTTTGGA 1260 

TTTGGGATGA CGTGTCAGTA TCGGTCAGAC TTCATCAATA TAGGTGGGTT TGATCTGQAC 1320 

ATCAAAGGCT GGGGCGGAGA GGATGTGCAC CTTTATCGCA ACTATCTCCA CAGCAAOCTC 1380 

ATAGTGGTAC GGACGOCTGT GCGAGGACTC TTOCACCTCT GGCATGAGAA GCGCTGCATG 1440 

GACGAGCTGA CCCCCGAGCA GTACAAGATG TGCATGCAGT CCAAGGCCAT GAACGAGGCA 1500 

TCCCACGGCC AGCTGGGCAT GCTGGTGTTC AGGCACGAGA TAGAGGCTCA CCTTCGCAAA 1560 
CAGAAACAGA AGACAAGTAG CAAAAAAACA TGA 



SEQ ID NO:206 PAJ3 Protein sequence: 
Protein Accession #: NP_060841 

1 11 21 31 41 51 

I I I I I I 

MVRRGLLAWI SKVWLLVLL CCAISVLYML ACTPKGDEEQ LALPRANSPT GKEGYQAVLQ 60 
EWEEQHRNYV SSLKHQIAQL KEELQBRSEQ LRNGQYQASD AAGLGLDRSP PEKTQADLLA 120 
PLHSQVDKAE VNAGVKLATE YAAVPFDSPT LQKVYQLETG LTRHPEEKPV RKDKRDELVE 180 
AIESALETLH NPAQJSPHHR PXTASOPIEG IYRTEKDKGT LYELTPKGDH KEEFKRLILF 240 
RPFGP1HKVK NEKLN&ANTL INVTVPLAKR VDKPRQFMQN FREMCIEQDG KVHLTWYFG 300 
KEEINEVKGI LENTSKAANF RNFTPIQLNG EFSRGKGLDV GARFWKGSNV LLFFCDVDIY 360 
PTSEFLNTCR LNTOPGKKVF YPVLPSQYNP GIIYGHHDAV PPLEQQLVIK KETGPHRDPG 420 
PGMTCQYRSD FINIGGFDLD IKGWGGEDVH LYRKYLHSNL IWRTPVRGL FHUJHEKFXM 480 
DELTFEQYKM CMQSKAMNBA SHGQLGMLVF RHHEAHLRK QKQKTSSKKT 

SEQ ID NO:207 PAJ5 DNA SEQUENCE 

Nucleic Acid Accession #: AF189723 

Coding sequence: 1-2712 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

ATGATTCCTG TATTGACATC AAAAAAAGCA AGTGAATTAC CAGTCAGTGA AGTTGCAAGC 60 

ATTCTCCAAG CTGATCTTCA GAA1GGTCTA AACAAATGTO AAGTTAGTCA TAGGCGAGCC 120 

TTTCATGGCT GGAAIGAGTT TGATATTAGT GAAGATGAGC CACTGTGGAA GAAGTATATT 180 

TCTCAGTTPA AAAATCCCCT TATTATGCTG CTTCTGGCTT CTGCAGTCAT CAGTGTTTTA 240 

ATGCATCAGT TTGATGATGC CGTCAGTATC ACTGTCGCAA TACTTATCGT TGTTACAGTT 300 

GCCTTTGTTC AGGAATATCG TTCAGAAAAA TCTCTTGAAG AATTGAGTAA ACTTGTGCCA 360 

CCAGAAXGCC ATTGTGTGCG TGAAGGAAAA ITGGAGCATA CACTTGCCCG AGACTTGGTT 420 

CCAGGTGATA CAGTTTGOCT TTCTGTTGGG GATAGAGTTC CTGCTGACTT ACGCTTGTTT 480 

GAGGCTGTGG ATCTTTCCAT TGATGAGTCC AGCTTGACAG GTGAGACAAC GCCT T GTTCT 540 

AAGGTGACAG CTCCTCAGCC AGCTGCAACT AATGGAGATC TTGCATCGAG AAGTAACATT 600 

GCCTTTATGG GAACACTGGT CAGATGTGGC AAAGCAAAGG GTGTTGTCAT TCGAACAGGA 660 

GAAAATTCTG AATTTGG GG A GGTTTTTAAA ATGATGCAAG CAGAAGAGGC ACCAAAAACC 720 

CCTCTGCAGA AGAGCATGGA CCTCTTAGGA AAACAACTTT CCTTTTACTC CTTTGGTATA 780 

ATAGGAATCA TCATGTTGGT TGGCTGGTTA CTGGGAAAAG ATATCCTGGA AATGTTTACT 840 

ATTAGTGTAA GTTTGGCTGT AGCAGCAATT CCTGAAGGTC TCCCCATTGT GGTCACAGTG 900 

ACGCTAGCTC TTGGTGTTAT GAGAATGGTG AAGAAAAGGG CCATTGTGAA AAAGCTGCCT 960 

ATTGTTGAAA CTCTGGGCTG CTGTAATGTG ATTTGTTCAG ATAAAACTGG AACACTGACG 1020 

AAGAATGAAA TGACTGTTAC TCACATATTT ACTTCAGATG GTCTGCATGC TGAGGTTACT 1080 

GGAGTTGGCT ATAATCAATT TGGGGAAGTG ATTGTTGATG GTGATGTTGT TCATGGATTC 1140 

TATAACCCAG CTGTTAGCAG AATTGTTGAG GCGGGCTGTG TGTGCAATGA TGCTGTAATT 1200 

AGAAACAATA CTCTAATGG3 GAAGCCAACA GAAGGGGCCT TAATTGCTCT TGCAATGAAG 1260 

ATGGGTCTTG ATGGACTTCA ACAAGACTAC ATCAGAAAAG CTGAATACCC TTTTAGCTCT 1320 

GAGCAAAAGT GGATGGCTGT TAAGTGTGTA CACCGAACAC AGCAGGACAG ACCAGAGATT 1380 

TGTTTTATGA AAGGTGCTTA CGAACAAGTA ATTAAGTACT GTACTACATA CCAGAGCAAA 1440 

GGGCAGACCT TGACACTTAC TCAGCAGCAG AGAGATGTGT ACCAACAAGA GAAGGCACGC 1500 

ATGGGCTCAG CGGGACTCA3 AGTTCTTGCT TTGGCTTCTG GTCCTGAACT GGQACAGCTG 1560 

ACATTTCTTG GCTT G GTG G G AATCATTGAT CCACCTAGAA CTGGTGTGAA AGAAGCTGTT 1620 

ACAACACTCA TTGCCTCAGG AGTATCAATA AAAATGATTA CTGGAGATTC ACAGGAGACT 1680 

GCAGTTGCAA TCGCCAGTCG TCTGGGATTO TATTCCAAAA CTTCCCAGTC AGTCTCAGGA 1740 

GAAGAAATAG ATGCAATGGA TGTTCAGCAG CTTTCACAAA TAGTACCAAA GGTTGCAGTA 1800 

TTTTACAGAG CTAGCCCAAG GCACAAGATG AAAATTATTA AGTCGCTACA GAAGAACGGT 1860 

TCAGTTGTAG CCATGACAGG AGATGGAGTA AATGATGCAG TTGCTCTGAA GGCTGCAGAC 1920 
ATTGGAGTTO CGATGGGCCA GACTGGTACA GATGTTTGCA AAGAGGCAGC AGACATGATC 1980 

CTftGTGGRTG ATGATTTTCA AACCATAATG TCTGCAATCG AASAGGGTAA AGGGATTTAT 2040 

AAIAACATTA AAAATTTCGT TAGATTCCAG CTGAGCACGA GTAIAGCAGC ATTAACTTTA 2100 

ATCTCATTGG CTACATTAAT GAACTTTCCT AATCCTCTCA ATGCCATGCA GATTTTGTGG 2160 

ATCAATATTA TTATGGATGG ACCCCCAGCT CAGAGOCTTG GAGTAGAACC AGTGGATAAA 2220 

GATGTCATTC OTAAACCTCC TCGCAACTGG AAAGACAGCA TTTTGACTAA AAAOTTGATA 2280 

CTTAAAATAC TTCTTTCATC AATAATCATT GTTTGTGGGA CTTTGTTTGT CTTCTGGCGT 2340 

GAGCTACGAG ACAATGTGAT TACACCTCGA GACACAACAA TGACCTTCAC ATGCTTTGTG 2400 

TTTTTTGACA TGTTCAATGC ACTAASTTCC AGATCCCAGA OCAAOTCTGT GTTTGAGATT 2460 

GGACTCTGCA GTAATAGAAT GTTTTGCTAT GCABTTCTTO GATCCATCAT GGGACAATTA 2520 

CTAGTTATTT ACTTTCCTCC GCTTCAGAAG GTTTTTCAGA CTGAGAGCCT AAGCATACTG 2580 

GATCTGTTGT TTCTTTTGGG TCTCACCTCA TCAGTGTGCA TAGTGGCAGA AATTATAAAG 2640 

AAGGTT6AAA GGAGCAGGGA AAAGATCCAG AAGCATGTTA GTTCGACATC ATCATCTTTT 2700 
CTTGAAGTAT GA 

SEQ ID NO:208 PAJ5 Protein sequence: 
Protein Accession #: AAF27813 

1 11 21 31 41 51 

I I I I I I 

KtPVLTSKKA SBLPVSEVAS ILQADLQKGL NKCEVSHKRA PHGWtJEFDIS RDEPLWKKYI 60 

SQFKNPLIHL LLASAVISVI. HHQFDDAVSI TVAILIW1V AFVQEYRSEK SLEELSKLVP 120 

FECHCVREGK LEHTIARDLV PGDTVCLSVG DRVPADLRLP EAVDLSIDES SLTGETTPCS 180 

KVTAPQPAAT NGDLASKSNI AFHQTLVBCG KAKGWIGTQ EHSEPQEVFK KHQAEEAPKT 240 

PLQKSMDLLG KQLSFYSFGI IGIIMLVGWL LGKDILEKPT ISVSLAVAAI PEGLPIWTV 300 

TLALGVMRKV KKEAIVKKLP XVETXiGCCNV ICSDKTGTIiT KKEMTVTKIP TSDGLHAEVT 360 

GVGHJQFGEV XVDGDWHGF YNPAVSRIVE AGCVCKDAVI RNNTLHGKPT EGALIALAMK 420 

KGLDGLQQEY IRKAEITPFSS EQKWMAVKCV HRTQQDRPEI CFHKGAYEQV IKYCTTYQSK 480 

GQTLTLTQOQ RDVYQQEKAR KGSAOLRVLA LASGPELGQL TPLGLVGIID PPRTGVKEAV 540 

TTLIASGVSI KMITGDSQET AVAIASRLGL YSKTSQSV5G EEIDAMDVQQ LSQIVPKVAV 600 

FYRASPRHKK KIIKSLQKNQ SWAMTGDGV KDAVALKAAD IGVAMGQTGT DVCKEAADMI 660 

LVDDDFC/riK SAIEEGKGIY NNIKNFVRPQ LSTSIAALTL ISLATLMNFP NPLNAHQILW 720 

INIIMDGPPA QSLGVEPVDK DVHOCPPBHW KDSILTKNLI LKILVSSIII VCGTLFVFWR 780 

ELRDNVITFR DTTMTFTCFV FFDMFNALSS RSQTKSVFEI GLCSNRMFCY AVLGSIKGQL 840 

LVTYFPPLQK VFQTESLSIL ELLFLLGLTS SVCIVAEIIK KVERSREKIQ KHVSSTSSSP 900 
LEV 

SEQ ID NO:209 PAV4 VARIANT 1 DNA SEQUENCE 

Nucleic Acid Accession #: N62096 

Coding sequence: 1-1284 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGA GAGGATTGJCC TTATTCAATG 60 

AAGCAAGCTG GGTTTCCTTT GGGAATATTG CTTTTATTCT GGGTTTCATA TGTTACAGAC 120 

TTTTCCCTTG TTTTATTGAT AAAAGGAGGG GCCCTCTCTG GAACAGATAC CTACCAGTCT 180 

TTGGTCAATA AAACTTTCGG CTTTCCAGGG TATCTGCTCC TCTCTGTTCT TCAGTTTTTG 240 

TATCCTTTTA TAGCAATGAT AAGTTACAAT ATAATAGCTG GAGATACTTT GAGCAAAGTT 300 

TTTCAAAGAA TCCCAGGAGT TGATCCTGAA AAOGTGTTTA TTGGTCGCCA CTTCATTATT 360 

GGACTTTCCA CAGTTACCTT TACTCTGCCT TTATCCTTGT ACCGAAATAT AGCAAAGCTT 420 

GGAAAGGTCT CCCTCATCTC TACAGGTTZA ACAACTCTGA TTCTTGGAAT TGTAATGGCA 480 

AGGGCAATTT CACTGGGTOC ACACATACCA AAAACAGAAG ACGCTTGGGT ATTTGCAAAG 540 

CCCAATGCCA TTCAAGCGGT CGGGGTTATG TCTTTTGCAT TTATTTGCCA CCATAACTCC 600 

TTCTTAGTTT ACAGTTCTCT AGAAGAACCC ACAGTAGCTA AGTGGTCCCG CCTTATCCAT 660 

ATGTCCATCG TGATTTCTGT ATTTATCTGT ATATTCTTTG CTACATGTGG ATACTTGACA 720 

TTTACTGGCT TCACCCAAGG GGACTTATTT GAAAATTACT GCAGAAATGA TGACCTGGTA 780 

ACATTTGGAA GATTTTGTTA TGGTGTCACT GTCATTTTGA CATACCCTAT GGAATGCTTT 840 

GTGACAAGAG AGGTAATTGC CAATGTGTTT TTTGGTGGGA ATCTTTCATC GGTTTTCCAC 900 

ATTGTTGTAA CAGTGATGGT CATCACTGTA GCCACGCTTG IGTCATTGCT GATTGATTGC 960 

CTCGGGATAG TTCTAGAACT CAATGGTGTG CTCTGTGCAA CTCCCCTCAT TTTTATCATT 1020 

CCATCAGCCT GTTATCTGAA ACTGTCTGAA GAACCAAGGA CACACTCCGA TAAGATTAtfG 1080 

TCTTGTGTCA TGCTTCCCAT TGGTGCTGTG GTGATGGTTT TTGGATTCGT CATGGCTATT 1140 

ACAAATACTC AAGACTGCAC CCATGGGCAG GAAATQTTCT ACTGCTTTCC TGACAATTTC 1200 

TCTCTCACAA ATACCTCAGA GTCTCATGTT CAGCAGACAA CACAACTTTC TACTTTAAAT 1260 
ATTAGTATCT TTCAACTCGA GTAA 



SEQ ID NO:210 PAV4 Variant 1 Protein sequence: 
Protein Accession #: none found 

1 11 21 31 41 51 

I I I I I I 

MGYQRQEPVI PPQRGLPYSM KQAGFPLGIL LLFWVSYVTD FSLVLLIKGG ALSGTDTYQS 60 
LVNKTFGPPO YLLLSVLQPL YPFIAMISYN IIAGDTLSKV PQRIPGVDPE NVFIGRHFII 120 
GLSTVTFTLP LSLYRNIAKL GKVSLISTGL TTLILGIVHA RAISLGPHIP KTEDAWVFAK 180 
PHAIQAVGVH SFAFICHHNS FLVYSSLBEP TVAKWSRLIH HSIVISVFIC IFFATCGYLT 240 
PTGFTQGDLP ENYCRNDDLV TFGRFCYGVT VILTyPKECP VTREVIANVF FGGNLSSVFH 300 
IWTVHVITV ATLVSLLIDC LGIVLBLNGV LCATPLIPII PSACYLKLSE EPRTHSDK1M 360 
SCVKLPIGAV VHVFGFVMAI TSTQDCTHGQ EMFYCPPDNP SLTNTSESHV QQTTQLSTLH 420 
SEQ ID NO:211 PAV4 VARIANT 2 DNA SEQUENCE 

Nucleic Acid Accession #: N62096 

Coding sequence: 1-1203 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGG GCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGT TTTCCCTTGT TTTATTGATA 60 

AAAGGAGGGG CCCTCTCTGG AACAGATACC TACCAGTCTT TGGTCAATAA AACTTTCGGC 120 

TTTCCAGGGT ATCTGCTCCT CTCTGTTCTT CAGTTTTTOT ATCCTTTTAT AGCAATGATA 180 

AGTTACAATA TAATAGCTGG AGATACTTTG AGCAAAGTTT TTCAAAGAAT OCCAGGAGTT 240 

GATCCTGAAA ACGTGTTTAT TGGTCGCCAC TTCATTATTG GACTTTCCAC AGTTACCTTT 300 

ACTCTGCCTT TATCCTTGTA CCGAAATATA GCAAAGCTTO GAAAGGTCTC OCTCATCTCT 360 

ACAGGTTTAA CAACTCTGAT TCTTGGAATT GTAATGGCAA GGGCAATTTC ACTGGGTCCA 420 

CACAXACCAA AAACACAAGA CGCTTGGGTA TTTGCAAAGC CCAATGCCAT TCAAGOGGTC 480 

GCGSTTATQT CTTTTGCATT TATTTGCCAC CA2AACTCCT TCTTAGTTTA CAGTTCTCTA 540 

GAAGAACCCA CAGTAGCTAA GTGGT0CC6C CTTATCCATA TGTCCATCQT GATTTCTGTA 600 

TTTATCTGTA 7ATTCTTTGC TACATGTGGA TACTTGACAT TTACTGGCTT CACCCAAGGG 660 

GACTTATTTG AAAATTACTG CAGAAATGAT GACCTGQTAA CATTTGGAAG ATTTTGTTAT 720 

GGTGTCACTG TCATTTTGAC ATACCCTATG GAATGCTTTG TGACAAGAGA GGTAATTGCC 780 

AATGTGTTTT TTGGTGGGAA TCTTTCATCG GTTTTOCACA TTGTTGTAAC AGTGATGGTC 840 

ATCACTOTAG CCACGCTTGT GTCATTGCTG ATTGATTOCC TCGGGATAGT TCTAGAACTC 900 

AATGGTGTGC TCTOTGCAAC TCCCCTCATT TTTATCATTC CATCAGCCTG TTATCTGAAA 960 

CTGTCTGAAG AACCAAGGAC ACACTCCGAT AAGATTATGT CTTGTGTCAT GCTTCCCATT 1020 

GGTGCTGTGG TGATGGTTTT TGGATTCGTC ATGGCTATTA CAAATACTCA AGACTGCACC 1080 

CAXGGGCAGG AAATGTTCTA CTGCTTTCCT GACAATTTCT CTCTCACAAA TACCTCAGAG 1140 

TCTCATGTTC AGCAGACAAC ACAACTTTCT ACTTTAAATA TTAGTATCTT TCAACTCGAG 1200 
TAA 



SEQ ID NO:212 PAV4 Variant 2 Protein sequence: 
Protein Accession #: none found 

1 11 21 31 41 51 

I I I I I I 

HCTQRQEFVI PPQFSLVLXjI KGGALSGTDT YQSLVNKTFG FPGYLLLSVL QPLYPFIAMI 60 

SYNI1AGDTL SKVFQRIPGV DPENVFIGRH FIIGLSTVTF TLPLSLYRNI AKLGXVSLIS 120 

TGLTTLILGI VMARAISLGP HIPKTEDAWV FAKPNAIQAV GVMSFAFICH HNSFLVYSSL 180 

EEPTVAKWSR LIHHSIVISV FICIFFATCG YDTFTGFTQG DLFENYCRKD DLVTFGRFCY 240 

GVTVIL1YPM ECFVTREVIA NVFFGGNLSS VFHIWTVHV ITVATLVSLL IDCLGIVLEL 300 

NSVLCATPLI FIIPSACYLK LSEEPF.THSD KMSCVMLPI GAWHVFGPV MAMMTQDCT 360 
HGQEMFYCFP DNFSLTOTSE SHVQQTTQLS TLNISIFQLE 

SEQ ID NO:213 PAV4 VARIANT 3 DNA SEQUENCE 

Nucleic Acid Accession #: N62096 
Coding sequence: 1-1140 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGG TCAATAAAAC TTTCGGCTTT 60 

CCAGGGTATC TGCTCCTCTC TGTTCTTCAG TTTTTGTATC CTTTTATAGC AATGATAAGT 120 

TACAATATAA TAGCTGGAGA TACTTTGAGC AAAGTTTTTC AAAGAATCCC AGGAGTTGAT 180 

CCTGAAAACG TGTTTATTGG TCGCCACTTC ATTATTGGAC TTTCCACAGT TACCTTTACT 240 

CTGCCTTTAT CCTTGTACCG AAATATAGCA AAGCTTGGAA AGGTCTCCCT CATCTCTACA 300 

GGTTTAACAA CTCTGATTCT TGGAATTGTA ATGGCAAGGG CAATTTCACT GGGTCCACAC 360 

ATACCAAAAA CAGAAGACGC TTGGGTATTT GCAAAGCCCA ATGCCATTCA AGCGGTCGGG 420 

GTTATGTCTT TTGCATTTAT TTGCCACCAT AACTCCTTCT TAGTTTACAG TTCTCTAGAA 480 

GAACCCACAG TAGCTAAGTG GTCCCGCCTT ATCCATATGT CCATCGTGAT TTCTGTATTT 540 

ATCTGTATAT TCTTTGCTAC ATGTGGATAC TTGACATTTA CTGGCTTCAC CCAAGGGGAC 600 

TTATTTGAAA ATTACTGCAG AAATGATGAC CTGGTAACAT TTGGAAGATT TTGTTATGGT 660 

GTCACTGTCA TTTTGACATA CCCTATGGAA TGCTTTGTGA CAAGAGAGGT AATT3CCAAT 720 

GTGTTTTTTG GTGGGAATCT TTCATCGGTT TTCCACATTG TTGTAACAGT GATGGTCATC 780 

ACTGTAGCCA CGCTTGTGTC ATTGCTGATT GATTGCCTCG GGATAGTTCT AGAACTCAAT 840 

GGTGTGCTCT GTGCAACTCC CCTCATTTTT ATCATTCCAT CAGCCTGTTA TCTGAAACTG 900 

TCTGAAGAAC CAAGGACACA CTCCGATAAG ATTATGTCTT GTGTCATGCT TCCCATTGGT 960 

GCTGTGGTGA TGGTTTTTGG ATTCGTCATG GCTATTACAA ATACTCAAGA CTGCACCCAT 1020 

GGGCAGGAAA TGTTCTACTG CTTTCCTGAC AATTTCTCTC TCACAAATAC CTCAGAGTCT 1080 
CATGTTCAGC AGACAACACA ACTTTCTACT TTAAATATTA GTATCTTTCA ACTCGAGTAA 



SEQ ID NO:214 PAV4 Variant 3 Protein sequence: 
Protein Accession #: none found 

1 11 21 31 41 51 

I I I I I I 

MGYQHQEPVI PPQVKKTFGF PGYLLLSVLQ FLYPFIAHIS YNIIAGDTLS KVFQRIPGVD 60 

PEMVFIGRHF IIGLSTVTPT LPLSLYFNXA KLGKVSLXST GLTTLILGZV HARAISLGPH 120 

IPKTEDAWVF AKPHAIQAVG VMSFAFICHH NSFLVYSSLE EPTVAKWSKt IHMSIVISVF 180 

ICIFFATCGY LTFTGFTQGD LFENYCRNDD IiVTFGRPCYG VTVILTYPME CFVTREVIAN 240 

VFFGGNLSSV FHIWTVMVI TVATLVSLLI DCLGIVLELN GVLCATPLIF IIPSACYLKI. 300 

SEEPRTHSDK IMSCVHLPIG AWHVFGFVM AITNTODCTH GQEHFYCPPD NFSLTNTSES 360 
HVOQTTQLST LNISIFQLE 



SEQ ID NO:215 PAV4 VARIANT 4 DNA SEQUENCE 

Nucleic Acid Accession #: N62096 

Coding sequence: 1-1389 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCC6CAGA GAGATTTAGA TGACAGAGAA 60 

ACOCTTGTTT CTGAACATGA GTATAAAGAG AAAACCTGTC AGTCTGCTGC TCTTTTTAftT 120 

GTTOTCAACT CGATTATAGG ATCTGGTATA ATAGGATTGC CTTATTCAAT GAAGCAAGCT 180 

GGGTTTCCTT TGGGAATATT GCTTTTATTC TGGGTTTCAT ATGTTACAGA CTTTTCC C T T 240 

GTTTTATTGA •EAAAAGGAGG GGCCCTCTCT GGAACAGAIA CCTACCAGTC TTTGGTCAAT 300 

AAAACTTTCG GCTTTCCAGO GTATCTGCTC CTCTCTGTTC TTCAGTTTTT GTATCCTTTT 360 

ATAGCAATGA TAAGTTACAA TATAATAGCT GGAQATACTT TGAGCAAAGT TCTTCAAAGA 420 

ATCCCAGGAQ TTGATCCTGA AAACQ T Q TT T ATTGGTCGCC ACTTCATTAT TGGACTTTCC 480 

ACAGTTACCT TTACTCTGCC TTTATCCTTG TACCGAAATA TAGCAAAGCT TGGAAAGGTC 540 

TCCCTCATCT CTACAGGTTT AACAACTCTQ ATTCTTGGAA TTGTAATGGC AAGGGCAATT 600 

TCACTGGGTC CACACATACC AAAAACAGAA GACGCTTGGG TATTTGCAAA GCCCAATGCC 660 

ATTCAAGGGG TCGGGGTTAT GTCTTTTGCA TTTATTTGCC ACCATAACTC CTTCTTAGTT 720 

TACAGTTCTC TAGAAGAACC CACAGTAGCT AAGTGGTCCC GCCTTATCCA TATGTCCATC 780 

GTGATTTCTG TATTTATCTG TATATTCTTT GCTACATGTG GATACTTGAC ATTTACTGGC 840 

TTCACCCAAG GGGACTTATT TGAAAATTAC TGCAGAAATG ATGACCTGGT AACATTTGGA 900 

AGATTTTGTT ATGGTGTCAC TGTCATTTTG ACATACCCTA TGGAATGCTT TGTGACAAGA 960 

GAGGTAATTG CCAATGTGTT TTTTGGTGGG AATCTTTCAT CGGTTTTCCA CATTGTTGTA 1020 

ACAGTGATGG TCATCACTGT AGCCACGCTT GTGTCATTGC TGATTGATTC CCTCGGGATA 1080 

GTTCTAGAAC TCAATGGTGT GCTCTGTGCA ACTCCCCTCA TTTTTATCAT TCCATCAGCC 1140 

TGTTATCTGA AACTGTCTGA AGAACCAAGG ACACACTCCG ATAAGATTAT GTCTTGTGTC 1200 

ATGCTTCCCA TTGGTGCTGT GGTGATGGTT TTTGGATTCG TCATGGCTAT TACAAAIACT 1260 

CAAGACTGCA CCCATGGGCA GGAAATGTTC TACTGCTTTC CTGACAATTT CTCTCTCACA 1320 

AATACCTCAG AGTCTCATGT TCAGCAGACA ACACAACTTT CTACTTTAAA TATTAGTATC 1380 
TTTCAATGA 



SEQ ID NO:216 PAV4 Variant 4 Protein sequence: 
Protein Accession #: none found 



1 11 21 31 41 51 

I I I I I I 

KGYQRQEPVI PPQRDLDDRE TLVSEHEYKE KTCQSAALFK WHSIIGSGI 1GLPYSMKQA 60 

GFPLGILLLP WVSYVTDFSL VLLIKGGALS GTDTYQSLVN KTFGPPGYLL LSVLQFLYFF 120 

IAMSMJIIA GDTLSKVFQR ZPGVDPENVF IGRHFIIGLS TVTFTLPLSL YRNIAKLGKV 180 

SLISTGLTTL ILGIVHARAI SLGPHIFKTE DAWVFAKPHA IQAVGVHSFA FICHHHSFLV 240 

YSSLBEPTVA KWSRI.IHMSI VISVFICIFF ATCGYLTFTG FTQGDLFENY CRNDDLVTFG 300 

RFCYGVTVIL TYPMECFVTR EVIANVFFGG NLSSVFHIW TVKVITVATL VSLLIDCLGI 360 

VLELNGVLCA TPLIPIIPSA CYLKLSEEPH THSDKIMSCV MLPIGAWHV FGPVMAITNT 420 
QDCTHGQEMF YCFPDKFSLT KTSESHVQQT TQLSTLWSI FQ 

SEQ ID NO:217 PAV3 DNA SEQUENCE 

Nucleic Acid Accession #: NM_017636 

Coding sequence: 1-3501 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

ATGOAGGATG CCTTCGGGGC AGCCGTGGTG ACCGTGTGGG ACAGCGATGC ACACACCACG 60 

GAGAAGCCCA CCGATGCCTA CGGAGAGCTG GACTTCACGG GGGCCGGCCG CAAGCACAGC 120 

AATTTCCTCC GGCTCTCTGA CCGAACGGAT CCAGCTGCAG TTTATAGTCT GGTCACACGC 180 

ACATGGGGCT TCCGTGCCCC GAACCT6GTG GTGTCAGTGC TGGGGGGATC GGGGGGCCCC 240 

GTCCTCCAGA CCTGGCTGCA GGACCTGCTG CGTCGTGGGC TGGTGCGGGC TGCCCAGAGC 300 

ACA6GAGCCT GGATTQTCAC TGGGGGTCTG CACACGGGCA TCGGCCGGCA TGTTGGTGTG 360 

GCTCTACGGG ACCATCAGAT GGCCAGCACT GGGGGCACCA AGGTGGTGGC CATGGGTGTG 420 

GCCCCCTGGG GTOTGGTCCG GAATAGAGAC ACCCTCATCA ACCCCAAGGG CTCGTTCCCT 480 

GCGAGGTACC GGTGGCGCGQ TGACCCGGAG GACGGGGTCC AGTTTCCCCT GGACTACAAC 540 

TACTCGGCCT TCTTCCTGGT GGACGACGGC ACACACGGCT GCCTGGGGGG CGAGAACCGC 600 

TTCCGCTTGC GCCTGGAGTC CTACATCTCA CAGCAGAAGA CGGGCGTGGG AGGGACTGGA 660 

ATB3ACATCC CTGTCCTGCT CCTCCTOATT GATGGTGATG AGAAGATGTT GACGCGAATA 720 

GAGAACGCCA CCCAGGCTCA GCTCCCATGT CTCCTCGTGG CTGGCTCAGG GGGAGCTGCG 780 

GACTGCCTGG CGGAGACCCT GGAAGACACT CTGGCCCCAG GGAGTGGGGG AGCCAGGCAA 840 

GGCGAAGOCC GAGATCGAAT CAGGCGTTTC TTTCCCAAAG GGGACCTTGA GGTCCTGCAG 900 

GCCCAGGTGG AGAGGATTAT GACCCGGAAG GAGCTCCTGA CAGTCTATTC TTCTGAGGAT 960 
GGGTCTCAGG AAXTOGAGAC CATAGTTTTG AAGGCCCTTG TGAAGGCCTG TGGGAGCTGG 1020 

GAGGCCTCAG CCTACCTGGA TGAGCTGCGT TTGGCTGTGG CTTGGAACCG CGTGGACATT 1080 

GCCCAGAGTG AACTCTTTCG GGGGGACATC CAATGGCGGT CCTTCCATCT CGAAGCTTCC 1140 

CICATGGAOG CCCTGCTGAA TGACCGGCCT GA6TTCGTGC GCTTGCTCAT TTCCCACGGC 1200 

CTCAGCCTGG GCCACTTCCT GACCCCGATG CGCCTGGCCC AACTCTACAG CGCGGCGCOC 1260 

TOCASCTCGC TCATCCGCAA CCTTTTGGAC CAQGCOTCCC ACAGGGCAGG CACCAAAGCC 1320 

CCAGCCCTAA AAGGGGGAGC TGCGGAGCTC CGGCCCCCTG ACGTGGGGCA TGTGCTGAGG 1380 

ATGCTGCTGG GGAAGATGTG CCCGCCGAGG TACCCCTCCG GGGGCGCCTG GGACCCTCAC 1440 

CCAGGCCAGG GCTTOQGG SA GAGCATGTAT CTGCTCTOGG ACAAGGCCAC CTCGCCGCTC 1500 

TCGCTGGATQ CTGGCCTCGG GCAGGCCCCC TGGAGCGACC TGCTTCTTTG GGCACTGTTG 1560 

CTGAACAGGG CACAGATGGC CATGTACTTC TGGGAGATGG GTTCCAATGC AGTTTCCTCA 1620 

GC T CTTGGGG CCTQTTTGCT GCTCCGGGTG ATGGCACGCC TGGAGCCTGA CGCTGAGGAG 1680 

GCAGCACGGA GGAAAGACCT GGCGTTCAAG TTTGAGGGGA TGGGCGTTGA CCTCTTTGOC 1740 

GAGTCCTATC GCAGCAGTGA GGTGACGGCT GCCCGCCTCC TCCTOCGTCG CTGCCCGCTC 1800 

TGGGGGGATG CCACTTGCCT CCAGCTGGOC ATGCAAGCTG AOGCCCGTGC CTTCTTTGCC 1860 

CAGGATGGGG TACAQTCTCT GCTGACACAO AAGTGGTGGS GAGATATGGC CAGCACTACA 1920 

CCCATCTGGG CCCTGGTTCT CGCC TT C TTT TGCCCTCCAC TCATCTACAC OCGCCTCATC 1980 

ACCTTCAGGA AATCAGAAGA GGAGCCCACA CGGGAGGAGC TAGAGTTTGA CATGGATAGT 2040 

GTCATTAATG GGGAAGGGCC TGTCGGGACG GGGGACOCAG CCGASAAGAC GCCGCTGGGG 2100 

GTCCCGCGCC AGTCGGGCCG TCCGGGTTGC TGCGGGGGCC GCTGOGGGGG GCGCCGGTGC 2160 

CTACGCCGCT GGTTCCACTT CTGGGGCGOG CCCGTOACCA TCTTCATGGG CAACGTGGTC 2220 

AGCTACCTGC TCTTCCTGCT GCTTTTCTOG CGGGTGCTGC TCGTGGATTT OCAGCCGGOG 2280 

CCGCCCGGCT CCCTGGAGCT GCTGCTCTAT TTCTGGGCTT TCACGCTGCT GTGCGAGGAA 2340 

CTGCCCCAGG GCCTGAGCGG AGGCGGGGOC AGCCTOGCCA GCGGGGGCCC CGGGCCTGOC 2400 

CATGCCTCAC TGAGCCAGCG CCTGCGCCTC TACCTCGCCG ACAGCTGGAA CCAGTGCGAC 2460 

CTACTGGCTC TCAOCTGCTT CCTCCTGGSC GTGGGCTGCC GGCTGACOCC GGGTTTGTAC 2520 

CAOCTGGGCC GCACTGTCCT CTGCATCGAC TTCATGGTTT TCACGGTGCG GCTGCTTCAC 2580 

ATCTTCACGG TCAACAAACA GCTGGGGCOC AAGATOGTCA TCGTGAGCAA GATGATGAAG 2640 

GACGTGTTCT TCTTCCTCTT CTTCCTCGGC GTGTGGCTGG TAGOCTATGG CGTGGCCACG 2700 

GAGGGGCTCC TGAGGCCACG GGACAGTGAC TTCCCAAGTA TCCTGCGCCG CGTCTTCTRC 2760 

CGTCCCTACC TOCAGATCTT CGGGCAGATT CCCCAGGAGG ACATGGACGT GGCCCTCATG 2820 

GAGCACAGCA ACTGCTCGTC GGAGCOCGOC TTCTGGGCAC AOCCTCCTCG GGCCCAGCCG 2880 

GGCACCTGCG TCTCCCAGTA TGCCAACTGG CTGGTGGTGC TGCTCCTCGT CATCTTCCTG 2940 

CTCGTGGCCA ACATCCTGCT GGTCAACTTG CTCAOTGCCA TGTTCAGTTA CACATTCGGC 3000 

AAAGTACAGG GCAACAGCGA TCTCTACTGG AAGGCGCAGC GTTACCGCCT CATCCGGGAA 3060 

TTCCACTCTC GGCCCGCGCT GGCCCCGCCC TTTATCGTCA TCTCCCACTT GCGCCTCCTG 3120 

CTCAGGCAAT TGTGCAGGCG ACCCCGGAGC CCCCAGCCGT CCTCCCCGGC CCTCGAGCAT 3180 

TTCCGGGTTT ACCTTTCTAA GGAAGCCGAG CGGAAGCTGC TAACGTGGGA ATCGGTGCAT 3240 

AAGGAGAACT TTCTGCTGGC ACGCGCTAGG GACAAGCGGG AGAGCQACTC CGAGCGTCTG 3300 

AAGCGCACGT CCCAGAAGGT GGACTTGGCA CTGAAACAGC TGGGACACAT CCGCGAGTAC 3360 

GAACAGCGCC TGAAAGTGCT GGAGCGGGAG GTCCAGCAGT GTAGCCGCCT CCTGGGGTGG 3420 

GTGGCCGAGG CGCTGAGCCG CTCTGCCTTG CTGCCCCCAG GTGGGCCGCC ACCCCCTGAC 3480 
CTGOCTGGGT CCAAAGACTG A 



SEQ ID NO:218 PAV3 Protein sequence: 

Protein Accession #: none found 

1 11 21 31 41 51 

I I I I I I 

HEDAFGAAW TVHDSDAHTT EKPTDAYGEL DFTGAGRKHS NFLRLSDRTD FAAVYSLVTR 60 

TWGFRAPNLV VSVLGGSGGP VLQTWLQDLL RRGLVRAAQS TGAMIVTGGL BTGIGRHVGV 120 

AVBDBQHAST GGTKWAMGV APWGWRNRD TLINFKGSFP ARYRWRGDPE DGVQPPLDYN 180 

YSAPFLVDDG THGCLGGENR FRLRLESYIS OQKTGVGGTG IDIPVLLLLI DGDEKMLTRI 240 

EKATQAQLPC LLVAGSGGAA DCLAETLEDT LAFGSGGARQ GBARORIRRF FFKGDLEVLQ 300 

AQVBRDfTRK ELLTVYSSED GSEEFETXVL RAXtVKACGSS EASAYLDELR LAVAWNHVDI 360 

AQSELFKGDI QWRSFBLEAS LMDALLNURP BFVRLLISHG LSLGHFLTPM RLAQLYSAAP 420 

SNSLIHNLLD OASHSAGTKA PALKGGAAEL RPPOVGHVLR MLLGXKCAPR YPSGGAHDPH 480 

PGQGFGESHY LLSDKATSFL SLDAGLGQAF WSDLLLWRLL LNRA<2G1MYF WEKGSNAVSS 540 

ALGACLLLRV MARLEPDAEE AARRKDLAPK FEGMGVDLFG ECYRSSEVBA ARLLLRRCPL 600 

WGDATCLQLA HQADARAFFA QDGVQSLLTQ KHWGDMASTT FXWU.VLAFF CPPLIYTRLI 660 

TFRKSEEEPT REELEFDMDS VIMSEGPVGT ADPAEKTFLG VPRQSGRPGC CGGRCGGRRC 720 

LRRWFHFWGA FVTIFHGNW SYLLFLLLFS RVLLVDFQPA PPGSLELLLY FWAFTLLCEB 780 

LROGLSGGGG SLASGGPGPG HASLSQRLRL YLADSKNQCD LVALTCFLLG VGCRLTPGLY 840 

HLGRTVLCID FMVFTVRLLH IFTVNKQLGP KIVIVSKMMK DVFFFLFFLG VWLVAYGVAT 900 

EGIiLRPRDSD FPSILRRVFY RFYLQIFGQI PQEDMDVALH EHSNCSSEPG FWAHPPGAQA 960 

GTCV5QYANW LWLLLVIFL LVANILLVHL LIAMFSYTFG KVQGNSDLYH KAQRYRLIRE 1020 

PHSRPALAPP FIVISKLRLL LRQLCRRPRS PQPSSPALBB FRVYLSKEAE RKLLTWESVH 1080 

KENFLLARAR DKRESDSERL KRTSQKVDLA LRQLGHZREY EQRLKVLERE VQQCSRVLGW 1140 
VAEAbSRSAL LPPGGPPPPD LPGSKD 



SEQ ID NO:219 PBF1 DNA SEQUENCE 

Nucleic Acid Accession #: AA054237 

Coding sequence: 1-894 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGGASCCGC GGGCGCTCGT CACGGCGCTC AGCCTCGGCC TCAGCCTGTG CTCCCTGGGG 60 
CTGCTCGTCA CGGCCATCTT CACCGACCAC TGGTACGAGA CCGJUXCCCG GCGCCACAAG 120 
GAGAGCTGCG AGCGCAGCCG CGCGGGCGCC GACCCCCCGG ACCAGAAGAA CCGCCTGATG 180 
CCGCTGTCGC ACCTGCCGCT GCGGGACTCG CC O CCGCTOS GGCGCCGGCT GCTOCCGGGC 240 

GGCCCGGGGC GCGOCGACCC CGAGTCCTGG CGCTCGCTOC TCGGGCTCGG CGGGCTGGAC 300 

GCCCAGTGCG GOCGGCCGCT CTTCGCCACC TACTCGGGCC TCTGGAGGAA GTGCTACTTC 360 

CTGGGCMCO ACCGGGACAT COACACCCTC ATCCTGAAAG GTATTGCGCA GCGATGCACQ 420 

GCCATCAAGT AOCACTTTTC TCAGCCCATC CGCTTGCGAA ACATTCCTTT TAATTTAACC 480 

AAGACCATAC AGCAAGATGA GTGGCACCTG CTTCATTTAA GAAGAATCAC TGCTGOCTTC 540 

CTDGGCftTGG CCGTAGCCGT CCTTCTCTGC GGCTGCATTG TGGCCACAGT CAGTTTCTTC 600 

TGGGAGGAGA GCTTGACCCA GCACGTGGCT GGACTCCTOT TCCTCATGAC AGGGATATTT 660 

TGCAOCAXTT CCCTCTGTAC TTATGCCGCC AGTATCTCOT ATGATTTGAA CCGGCTCCCA 720 

AAGCTAATTT ATAGCCTGCC TGCTGATGTG GAACATGGTT ACAGCTGGTC CATCTTTTGC 780 

GCCTGOTOCA GTTTAGGCTT TATTGTGGCA GCTGGAGGTC TCTGCATOGC TTATCCGTTT 840 
ATTAGCCGGA CCAAOATTGC ACAGCTAAAO TCTSQCAOAO ACTCCACGGT ATGA 



SEQ ID NO:220 PBF1 Protein sequence: 

Protein Accession #: none found 

1 11 21 31 41 51 

I I I I I I 

KBPSALVTAL SLGLSLCSLG LLVTAIFTOH WYETDPRRHK BSCBRSRAGA DPPDQKNRM 60 

PLSHLPLRDS PPLGRRLLPG GPGRADPBSW RSLLGLGGLD AECGRPLFAT YSGLWRKCTF 120 

LOIDRDIDTL XUOGIAQRCT AIKYHFSQPI RLRHIPFNLT KTTQQDEWHL LHLRRITAGF 180 

LGHAVAVLLC OCXVATVSFF WEESLTQHVA GLULKTGIF CTISLCTYAA SISYDLHRLP 240 
KLIYSLPADV EHGYSWSIFC AWCSLGPIVA AGGLCIAYPF ISRTKIAQLK SGRDSTV 



SEQ ID NO:221 PCM DNA SEQUENCE

Nucleic Acid Accession #: NM_016570

Coding sequence: 1-1134 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGAGGCGAC TGAATCGGAA AAAAACTTTA AGTTTGGTAA AAGAGTTGGA TGCCTTT006 60 

AAGGTTCCTQ AGAGCTATGT AGAGACTTCA GCCAGTGQAG GTACAGTTTC TCTAATAGCA 120 

TTTACAACTA TGGCTTTATT AACCATAATG GAATTCTCAS TATATCAAGA TACATGGATG 180 

AAGTATGAAT ACGAAGTAGA CAAGGATTTT TCTAGCAAAT TAAGAATTAA TATAGATATT 240 

ACTGTTGCCA TGSACTGTCA ATATGTTGGA GCGGATGTAT TGGATTTAGC AGAAACAATG 300 

GTTGCATCTG CAGATG6TTT AGTTTATGAA OCAACAGTAT TTGATCTTTC ACCACAGCAG 360 

AAAGAGTGGC AGAGGATGCT GCAGCTGATT CAGAGTAGGC TACAAGAAGA GCATTCACTT 420 

CAAGATGTGA TATTTAAAAG TGCTTTTAAA AGTACATCAA CAGCTCTTCC ACCAAGAGAA 480 

GATGATTCAT CACAGTCTCC AAATGCATGC AGAATTCATG GCCATCTATA TGTCAATAAA 540 

GTAGCAGGGA ATTTTCACAT AACAGTGGGC AAGGCAATTC CACATCCTCG TGGTCATGCA 600 

CATTTGGCAG CACTTGTCAA CCATGAATCT TACAATTTTT CTCATAGAAT AGATCATTTG 660 

TCTTTTGGAG AGCTTGTTCC AGCAATTATT AATCCTTTAG ATGGAACTGA AAAAATTGCT 720 

ATAGATCACA AOCAGATGTT CCAATATTTT ATTACAGTTG TGCCAACAAA ACTACATACA 780 

TATAAAATAT CAGCAGACAC CCATCAGTTT TCTGTGACAG AAAGGGAACG TATCATTAAC 840 

CATGCTGCAG GCAGCCAIGG AGTCTCTGGG ATATTTATGA AATATGATCT CAGTTCTCTT 900 

ATGGTCACAG TTACTGAGGA GCACATGCCA TTCTGGCAjGT TTTTTGTAAG ACTCTGTGGT 960 

ATTGTTGGAG GAATCTTTTC AACAACAGGC ATGTTACATG GAATTGGAAA ATTTATAGTT 1020 

GAAATAATTT GCTGTCGTTT CAGACTTGGA TCCTATAAAC CTGTCAATTC TGTTCCTTTT 1080 
GAGOATGGCC ACACAGACAA CCACTTACCT CTTTTAGAAA ATAATACACA TTGA 



SEQ ID NO:222 PCM Protein sequence:

Protein Accession #: NP_057654

1 11 21 31 41 51

| | | | | |

MHHLHHKKTL SLVKBLHAFP KVPESYVETS ASGGTV5LIA FTTMMJ,TIM EFSVYQDTWM 60 

KYEYEVDKDP SSKLRINIDI TVAMKCQYVG ADVLDLAETM VASADGLVYE PTVPDLSPQQ 120 

KEWQRMLQLI QSRI/QEEHSL QDVIPKSAFK STSTALPPRE DDSSQSPNAC RIHGHLYVNK 180 

VAGNPHITVG KAIPHPRGHA RLAALVNRES YNFSHRIDHL SPGELVPAII NPLDGTEKIA 240 

IDHNQMFQYF IOWPTKLHT YKISADTHQF SVTERERIIN HAAGSHGVSG IFMKYDLSSL 300 

MVTVTEEHMP FWQFFVRLCG XVGGIFSTTG KLHGIGKFXV EIICCRFRLG SYKPVHSVPF 360 
EOGHTDNRLP LLENNTR 



SEQ ID NO:223 PEZ3 DNA SEQUENCE

Nucleic Acid Accession #: NM_001935.1

Coding sequence: 76-2301 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

CGCGCGTCTC CGCCGCCCGC GTGACTTCTG CCTGCGCTCC TTCTCTGAAC GCTCACTTCC 60 

GAGGAGACGC CGACGATGAA GACACCGTGG AAGATTCTTC TGGGACTGCT GGGTGCTGCT 120 

GCGCTTGTCA CCATCATCAC CGTGCCCGTG GTTCTGCTGA ACAAAGGCAC AGATCATGCT 180 

ACAGCTGACA GTCGCAAAAC TTACACTCTA ACTGATTACT TAAAAAATAC TTATAGACTG 240 

AAGTTATACT CCTTAAGATG GATTTCAGAT CATGAATATC TCTACAAACA AGAAAATAAT 300
ATCTTGCTAT TCAATGCTGA ATATGGAAAC AGCTCAGTTT TCTTGGAQAA CAGTACATTT 360

GAIGAGTTTQ GACATTCTAT CAATCATTAT TCAATATCTC CTGATGGGCA GTTTATTCTC 420 

TTAGAATACA ACTACGTGAA GCAATGGAGG CATTCCTACA CAGCTTCATA TGACATTTAT 480 

GATTTAAATA AAAG6CAGCT GATTACAQAA GAGAGGATTC CAAACAACAC ACAGTGGGTC 540 

ACATGGTCAC CAGTGGGTCA TAAATTGGCA TA JG TTTGG A ACAATGACAT TTATGTTAAA 600 

ATTGAACCAA ATTTACCAAG TTACAGAATC ACATGGAOGG GGAAAGAAGA TATAATATAT 660 

RATGGAATAA CTGACTGGGT TTATGAASAG GAAGTCTTCA GTGCCTACTC TGCTCTGTG G 720 

TGGTCTCCAA ACGGCACTTT TTTAGCATAT GCCCAATTTA ACGACACAGA AGTCCCACTT 780 

ATT6AAXACT CCTTCTACTC TGATGAGTCA CT6CAGTACC CAAAQACTOT ACGGGTTCCA 840 

TATCCAAAGG CAGGAGCTGT GAATCCAACT GTAAAGTTCT TTGTTGTAAA TACAGACTCT 900 

CTCAGCTCAG TCACCAATGC AACTTCCAIA CAAATCACTQ CTCCTGCTTC TATGTTGATA 960 

GGGGATCACT ACTTGTGTGA TGTGACATGG GCAACACAAG AAAGAATTTC TTTGCAGTGO 1020 

CTCAGGAGGA TTCAGAACTA TTCGGTCATO GATATTTQTQ ACTATGATGA ATCCASTGGA 1080 

AGAIGGAACT GCTTAGTGGC ACGGCAACAC ATTGAAAXGA GTACTACTGG CTGGGTTGGA 1140 

AGATTTAGGC CTTCAGAACC TCATTTTACC CTTGATGGTA ATAGCTTCTA CAAGATCATC 1200 

AGCAATQAAG AAGGTTACAG ACACATTTGC TATTTCCAAA TAGATAAAAA AGACTGCACA 1260 

TTTATTACAA AAGSCACCTG GGAAGTCATC GGGATAGAAG CTCTAACCAG TGATTATCTA 1320 

7ACTACATTA GTAATGAAXA TAAAGGAATG OCAGGAGGAA GGAATCTTTA TAAAATCCAA 1380 

CTTATTGACT ATACAAAAGT GACATGCCTC AGTTGTGAGC TGAATCCGGA AAGGTQTCAG 1440 

TACTATTCTQ TCTCATTCAG TAAAGAGGCG AAGTATTATC AGCTGAGATG TTCCGGTCCT 1500 

GGTCTGOCCC TCTATACTCT ACACAGCAGC GTGAATGAIA AAGGGCTGAG AGTOCTGGAA 1560 

GACAATTCAG CTTTGGATAA AATGCTGCAG AATGTCCAGA T6CCCTCCAA AAAACTGGAC 1620 

TTCATTATTT TGAATGAAAC AAAATTTTGG TATCAGATGA TCTTGCCTCC TCATTTTGAT 1680 

AAATCCAASA AATATCCTCT ACTATTAGAT GTCTATGCAG GCCCATGTAG TCAAAAAGCA 1740 

GACACTGTCT TCAGACTGAA CTGGGCCACT TACCTTGCAA GCACAGAAAA CATTATAGTA 1800 

GCTAGCTTTG ATGGCAGAGG AAGTGGTTAC CAAGGAGATA AGATCATGCA TGCAATCAAC 1860 

AGAAGACTGG GAACATTTGA AGTTGAAGAT CAAATTQAAG CAGCCAGACA ATTTTCAAAA 1920 

ATGGGATTTG TGGACAACAA ACGAATTGCA ATTTGGGGCT GGTCATATGG AGGGTACGTA 1980 

ACCTCAATGG TCCTGGGATC GGQAAGTQGC GTOTTCAAGT GTG6AATAGC CGTGGCGCCT 2040 

GTATCCCGGT GGGAGTACTA TGACTCAGTG TACACAGAAC GTTACATGGG TCTCCCAACT 2100 

CCAGAAGACA ACCTTGACCA TTACAGAAAT TCAACAGTCA TGAGCAGAGC TGAAAATTTT 2160 

AAACAAGTTG AGTACCTCCT TATTCATGGA ACAGCAGATG ATAACGTTCA CTTTCAGCAG 2220 

TCAGCTCAGA TCTCCAAAGC CCTGGTGGAT GTTGGAGTGG ATTTCCAGGC AATGTGGTAT 2280 

ACTGATGAAQ ACCATGQAAT AGC TAGCASC ACAGCACACC AACATATATA TACCCACATG 2340 

AGCCACTTCA TAAAACAATG TTTCTCTTTA OCTTAGCACC TCAAAATACC ATGCCATTTA 2400 

AAGCTTATTA AAACTCATTT TTGTTTTCAT TATCTCAAAA CTGCACTGTC AAGATGAT6A 2460 

TGATCTTTAA AATACACACT CAAATCAAGA AACTTAAGOT TACCTTTGTT CCCAAATTTC 2520 

ATACCTATCA TCTTAAGTAG GGACTTCTGT CTTCACAACA GATTATTACC TTACAGAAGT 2580 

TTGAATTATC CGGTCGGGTT TTATTGTTTA AAATCATTTC TGCATCAGCT GCTGAAACAA 2640 

CAAATAGGAA TTGTTTTTAT GGAGGCTTTG CATAGATTCC CTGAGCAGGA TTTTAATCTT 2700 

TTTCTAACTG GACTGGTTCA AAT GTTGTTC TCTTCTTTAA AGGGATGGCA AGATGTGGGC 2760 

AGTGATGTCA CTAGGGCAGG GACAGGATAA GAGGGATTAQ GGAGAGAAGA TAGCAGGGCA 2820 

TGGCT6GGAA CCCAAGTCCA AGCATACCAA CACGAGCAGG CTAjCTGTCAG CTCCCCTCGG 2880 

AGAAGAGCTG TTCACCACGA GACTGGCACA CTTTTCTGAG AAAGACTATT CAAACAGTCT 2940 

CAGGAAA.TCA AAXA1CGAAA GCACTGACTE CEAAGTAAAC CACAGCAGTT GAAAGACTCC 3000 

AAAGAAATGT AAGGGAAACT GCCAGCAACG CAGCCOOCAG GTGCCAGTTA TGGCTAIAGG 3060 

TGCTACAAAA ACACAGCAAG GGTGATGGGA AAGCATTGTA AATGTGCTTT TAAAAAAAAA 3120 

TACTGATGTT CCTAGTGAAA GAGGCAGCTT GAAACTGAGA TGTGAACACA TCAGCTTGOC 3180 

CTGTTAAAAG ATGAAAATAT TTGTATCACA AATCTTAACT TGAAGGAGTC CTTGCATCAA 3240 

TTTTTCTTAT TTCATTTCTT TGAGTGTCTT AATTAAAAGA ATATTTTAAC TTCCTTGGAC 3300 

TCATTTTAAA AAATGGAACA TAAAATACAA TGTTATGTAT TATTATTCOC ATTCTACATA 3360 
CTATGGAATT TCTCCCAGTC ATTXAAXAAA TGTGCCTTCA TTTTTTC 



SEQ ID NO:224 PEZ3 Protein sequence:

Protein Accession #: NP_001926.1

1 11 21 31 41 51

| | | | | |

MKTPWKILLG LLGAAALVTI ITVFWLU9K GTDDATADSR KTYTLTDYLK NTYRLRLYSI. 60 

RWISDHBYLY KQENNILVFN AEYGNSSVFL EHSTFDEFGH SINDYSISPD GQFILLEYNY 120 

VKQWRHSYTA SYDIYDLNKR QLITEERIPH NTQWVTWSPV GHKLAYVWNN DIYVKIEPNL 180 

PSYRIWTGK EDIIYNGITD WYEEEVPSA YSALWWSPNG TFLAYAQFND TEVPLIBYSP 240 

YSDBSLQYPK TVKVPYPRAG AVNPTVKPFV VNTDSLSSVT NATSIQITAP ASHLIGDHYL 300 

CDVTWATQBR ZSLQWLRRIQ NYSVMDICDY DESSGRHHCL VABQH1EMST TGWVGRFRPS 360 

EPHFTLDGNS FYKIISNEEG YRHICYFQID KKD C TFI T K G TWEVIGIEAL TSDYLYYISN 420 

EYKGHPGGBN LYKIQLIDYT KVTCLSCELN PERCQYYSVS FSKEAKYYQL RCSGPGLPLY 480 

TLHSSVNDKG LRVLEDNSAL DKMLCNVQMP SKKLDFIILN ETKFWYQMIL PPHPDKSKKY 540 

PLLLDVYAGP CSQKAOTVFR LSWATYLAST EOTIVASFDG RGSGYQGDKI MHAINRRLGT 600 

FEVEDQIEAA RQFSKMGPVD NKRIAIKGWS YGGWTSMVL GSGSGVPKCG IAVAPVSRWE 660 

YYDSVYTERY KGLPTPEDNL DHYRNSTVHS RAENPKQVEY LLIHGTADDN VHFQQSAQIS 720 
KALVDVGVDF QAHWYTDEDH GIASSTAHQH IYTBMSBFIK QCFSLP 

SEQ ID NO:225 PBJ2 DNA SEQUENCE

Nucleic Acid Accession #: none found

Coding sequence: 1-261 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |
ATGGCTCTGG CGAAGGTGAG GGAGCCAAAC GCAAATGACA ATGCCATCAG AGTTCACAAC 60

AGAAGTGTGA TTAAAGTGCG TGCTAACCAQ TGTTOOCTGC ATGAGGCAGA AAQTQAATCC 120 

AGAAACCCTC AGGAGCTCTG GATGGGCCTG CTCCTCTTGA TGGGGGTCCT AGAAGCATGT 180 

GTGGAAAXGA GGCCTCTG T C A GT CTOGTCC CTGAGAGATG ACAAGGAGCA GAGCCCCCAC 240 
CAGOCCACAC TGGATGTCTA A 



SEQ ID NO:226 PBJ2 Protein sequence:

Protein Accession #: none found

1 11 21 31 41 51

| | | | | |

HALAKVRKPN ANDNAIRVDN RSVIKVRANQ CSLHEAESES RHPQELWHGL LLLHGVLEAC 60 
VEMRPLSVWS LRDDKEQSPH QPTLDV 

SEQ ID NO:227 PBM2 DNA SEQUENCE

Nucleic Acid Accession #: none found

Coding sequence: 1-462 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGOCAAATG CTOAGTTAQA AGCAAAGAGC CTTGQRAGCA GTAAATGTTT AAAAACTGCT 60 

CTCATACTT6 CTGTATGTTG TGGATCAGCA AATATAGTCA GCCC T CTACT TGAGCAAAAT 120 

ATTGATGTAT CTTCTCAAGA TCTG6ACAGA C6GCCAGAGA GTATGCTGTT TCTAGTCATC 180 

ATCATGTGGA CCAGTTTTGT GGAAGACAAT CTTTCCATGG GCTGGGGGAA GCTASAAGAT 240 

TTTATGGCTA tTTGAAGAAGA AATGAAGAAfi CACGGAASXA CTCATGTGGG ATTCCCAGAA 300 

AACCTGACTA ATQGT6CCGC TGCTGGCAAT GGTGATGATO GATTAATTCC TCCAAGGAAG 360 

AGCAGAACAC CTGAAA6CCA GCAATTTCCT GACACTGAGA ATCAAGAGTA TCACAGGTTT 420 
GTCAAAGATC AGATAGTTGT AGATATGCGG CGTTATTT CT GA 

SEQ ID NO:228 PBM2 Protein sequence:

Protein Accession #: none found

1 11 21 31 41 51

| | | | | |

MPNABLEAKS LGSSKCLKTA LILAVCCGSA NTVSPLLEQH IDVSSQDLDR RPESMLFLVI 60 
XMWTSFVEDN tSBQWGKLED FMAIEEEHKK HGSTHVGFPE KLTNGAAAGK GDDGLIPFRK 120

SRTPESQQFP DTENEEYHRF VKDQIWDMR RYF



SEQ ID NO:229 PEZ2 DNA SEQUENCE

Nucleic Acid Accession #: NM_014253

Coding sequence: 65-8242 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

GACTGCTTGC ATTAAASGAC TTCCTCATCC TTTTTTTCAT GAAACTGAGC TTGCTTAATC 60 

AGAGATGGAG CAAACT6ACT GCAAACCCTA CCAGCCTCTA GCAAAAGTCA AGCATGAAAT 120 

6SATCTAGCT TACAOCAGTT CTTCTGATGA GAGTGAASAT GSAAGAAAAC CAAGACAGTC 180 

ATACAACTCC AGGGAGAGCC TGCACGAGTA TAACCACGAG CTGAGGATG& ATTACAATAG 240 

CCAGAGTAGA AAGAGGAAAC AAGTAGAAAA ATCTACTCAA GAGATG6AAT TCTGTGAAAC 300 

CTCTCACACT CTGTGCTCTG 6CTACCAAAC AGACATGCAC AGCGTTTCTC GGCATGGCTA 360 

CCAGCTAGAG ATGGGATCTG ATGTGGACAC AGAGACAGAA GGTGCTGCCT CACCTGACCA 420 

TGCACTAAGA ATGTGGATAA GGGGAATGAA AXCAGAGCAT AGTTCCTGTT TGTCCAGCCG 480 

GGCCAACTCT GCATTATCCT TGACTGACAC TGAOCATGAA AGGAAOTCT0 ATCGGGAAAA 540 

TGGTTTCAAA TTCTCTCCTG TTTGTTGTGA CA3GGAGGCT CAAGCTGGGT CTACTCAAGA 600 

TGTGCAGAGC AGCCCACACA ACCAGTTCAC CTTCAGACOC CTCCCACCGC CACCTCCGCC 660 

TCCTCATGCC TGCACCTGTC CCAGGAAGCC ACCCCCTGCA GOGGACTCTC TTCAGAGGAG 720 

ATCAATGACT ACCCGCAGCC AGCCCAGCCC AGCTGCTCCA GCTCCCOCAA CCAGCACGCA 780 

GGATTCAOTC CATCTGCATA ACAGCTGGGT CCT6AACAGC AACATAOCA? TGGASACCAS 840 

GCATTCCCTG TTCAAACATG OATCTGGTTC CTCTGOGATC TTCAGTGCAG CCAGTCAGAA 900 

CTACCCTCTG ACATOCAATA CCGTGTACTC GCCCCCTCCC AGGCCTCTTC CTCGAAGCAC 960 

CTTTTCCCGA OCTGCCTTTA CCTTTAACAA ACCTTACAGG TGCTGCAACT GGAAGTGCAC 1020 

AGCATTGAGC GCCACTGCAA TCACAGTGAC TTTGGCCTTG TTACTAGCCT ATGTGATTGC 1080 

AGTGCATTTC TTCGGCCTGA CTTGGCAGTT GCAACCAGTT GAAGGAGAGC TGTATGCAAA 1140 

TGGAGTTAGC AAAGGQAACA GGGGGACCGA GTCCATGGAC ACTACTTACT CTCCAATTGG 1200 

AGGAAAAGTT TCTGATAAAT CAGAGAAAAA AGTGTTTCAG AAGGGAOGGG CGATAGACAC 1260 

TGGAGAAGTT GACATTGGTG CACAGGTCAT GCAGACCATT CCACCTGGTT TATTCTGGCG 1320

TTTCCAGATT ACTATCCACC ATCCAATAIA TCTGAAGTTC AATATTTCTT TAGCCAAGGA 1380 

CTCTCTGCTG GGAATTTATG GCAGAAGAAA CATTCCACCT ACACATACTC AGTTTGATTT 1440 

TGTAAAACTA ATGGATGGCA AACA GCTG GT CAAGCAGGAC TCCAAGGGCT CTGATGATAC 1500 

ACAGCACTCC CCTCGGAACC TGATCTTAAC TTCG CTTCA G GAGACAGGTT TCATAGAGTA 1560

TATGGATCAA GGACCTTOGT ATCTGGCGTT TTACAATGAT GGAAAAAAGA TGGAGCAAGT 1620

ATTCGTGTTA ACTACAGCAA TTGAAATAAT GGATGACTCT TCAACCAATT GCAATGGAAA 1680 

1GGAGAGTGT ATCTCTGGCC ATTGTCATTG TTTCCCAGGA TTOCTT GG AC CTGACTGTGC 1740 

TAGAGATTOC TGCCCTGTGC TGTGTGGTGG GAATGGAGAA TAOGAGAAAG GACACTGTGT 1800 

CTGCCGGCAT GGCTGGAAGG GGCCAGAGTG TGACGTTCCG GAAGAACAAT GCATTGATCC 1860

AACATGCTTT GGCCACGGCA CCTGCATCAT GGGAGTCTGC ATCTGTGTGC CAGGATACAA 1920
AGGAGAAATA TGOGAGGAAG AGGACTGCCT AGACOCAATG TGTTCCAACC ATGGCATCTG 1980

TGTAAAACGA GAATGTCACT GTTCTACTGG CTGGGGAGGA GTTAACTQTQ AAACACCACT 2040 

TOCTGT A TGT. CAAGAGCAGT GCTCAGGACA CGGAACTTTT CTTCTGGACG CTGGAGTATG 2100 

CAGCTGTGAT CCCAAGTGGA CAGGATCTGA CTGCTCAACA GAGCTGTGTA CCATGGAGTG 2160 

TGGTAGCCAT GOAGTCTGCT CAAGAGGAAT TTGCCAGTGT GAAGAAGGCT GGGTAGGACC 2220 

AACATGTGAG GAAC6CTCCT GTCATTCTCA TTGTACTGAG CATGGCCAAT GCAAAGATGG 2280 

AAAATGTGAO TGTAGCCCTQ GA.TGGGAGGG COACCACTOC ACAATTGCTC ACTACTTAGA 2340 

TGCTGTCCGA GATGGC7GCC CAGGGCTCTO CTTTGGAAAT GGACGATGTA CCCTGGATCA 2400

AAATGGTTGG CACTGTGTG? GTCAGGTGGG TTGGAGTGGQ ACAGGCTGCA ATGTTGTCAT 2460 

GGAAATGCTT TGTGGAGATA ACTTGGACAA TGATGGAGAT GGTTTAACCG ACTGTGTGGA 2520 

TCCTGACTGT TGTCAACAAA GCAACTGTTA TATAAGTCCT CTCTGCCAGG GCTCACCAGA 2580 

TCCTCTTGAC CTCATTCAGC AAAGCCAAAC TCTCTTCTCT CAGCACACTT CAAGACTTTT 2640 

TTATGATCGA ATCAAATTCC TCATTGGCAA GGACAGTACT CATGTCATTC CTCCTGAGGT 2700 

GTCATTTGAC AGCAGGCGTG CCTGTGTGAT TCGAGGCCAA GTGOTGGCCA TAGATGGAAC 2760 

TCCTCTAQTG GGAGTGAATG TCAGTTTCTT GCACCACAGT GATTATGGGT TTACCATCAG 2820 

CCGGCAAGAT GGAAGCTTTG ACCTCGTG G C CATCGGTQGC ATCTCTGTCA TCTTAATCTT 2880 

CGACCGATCC CCTTTCCTQC CTGAGAAJSAQ AACACTCTGG TTGCCTTGGA ATCAOTTTAT 2940 

TGTGGTAQAG AAACTCACCA TGCAGAGAC? TGTATCAGAC CCGCCATCCT GCGRTRTCTC 3000 

CAACTTTATC AGCCCAAACC CT A TTQTGCT TCCTTCACCG CTCACATCAT TTGGAGGGTC 3060 

CTGTCCAOAQ AGGGGAACTA TTGTTOCTQA GCTGCAGGTT GTACAGGAGG AAATTCCCAT 3120 

TCCCTCCAGC TTTGTGAGGC TGAGTTACCT GAGCAGCCCC ACCCCTGGGT ATAAAACGCT 3180 

GCTACGGATC CTTCTCACAC ATTCAACGAT TCCCGTAGGC ATGATAAAAG TACACCTCAC 3240 

AGTAGCTGTG GAAGGGCGAC TCACACAGAA GTGGTTTCCC GCCGCAATTA ATCTTGTCTA 3300 

CACATTTGCT 7GGAACAAGA CCGATATCTA TGGACAGAAG GTTTGGGGCC TGGCAGAGGC 3360 

TTTGGTATCT GTGGGA3ATG AATATGAAAC GTGCCCTGAC TTTATTCTCT GGGAGCAAAG 3420 

GACAGTCGTT TTACAAGGTT TTGAGATGGA TGCTTCTAAC CTAGGAGACT GGTCTTTGAA 3480 

TAAGCATCAC ATTTTGAATC CTCAAAGTGQ AATCATACAT AAAGGGAATQ OAGAAAATAT 3540 

GTTCATTTCC CAGCAGCCGC CAGTCATATC AACCATAATG GGTAATGGAC ACCAAAGGAG 3600 

TGTAGCCTGC ACCAACTGCA ATGGCCCAGC CCACAACAAC AAACTCTTTG CTCCTGTCGC 3660 

CTTAGCTTCT GGCOCTGA3X3 GCAOTGTGTA TSTTGGCGAC TTCAATTTTG TAAGGAGAAT 3720 

ATTTCCCTCG GGAAACTOOG TTACTATTTT GGAATTAAGC ACAAGTCCTG CTCACAAATA 3780 

CTATCTGGCT ATGGAOOCTG TGTCTGAATC ACTCTATCTA TCAGACACCA ATACTCGCAA 3840 

AGTCTACAAG TTGAAATCTC TTGTGGAGAC GAAAGATCTG TCCAAGAATT TTGAAGTGGT 3900 

GGCAGGAACT GGTGATCAGT GCCTTCCCTT TGACCAGAGT CATTGTGGAG ATGGTGGGAG 3960 

AGCATCGGAA GCTTCACTGA ATAGCCCTCG AGGCATCACA GTTGATAGGC ATGGATTTAT 4020 

TTACTTTGTG GATGGGACTA TGATTCGCAA AATTGATGAG AATGCTGTGA TCACAACTGT 4080 

AATOGGCTCA AATGGTCTGA CTTCCACACA ACCACTGAGC TGTGACTCAG GAATGGACAT 4140 

CACTCAGGTG CGATTAGAGT GGCCAACAGA CCTTGCAGTA AATCCTATGG ACAATTCAIT 4200 

GTATGTCTTG GAZAACAACA TTGTGCTGCA AATTTCTGAG AACAGGCGTG TTCGGATCAT 4260 

CGCAGGACGC CCCATTCACT GCCAGGTGCC AGGCATCGAT CATTTCCTGG TCAGCAAGGT 4320 

AGCAATTCAC TCCACTCTAG AGTCAGCGAG GGCCATCAGT GTCTCCCACA GCGGGCTGCT 4380 

CTTCATAGCT GAAACAGACG AGAGGAAAGT AAACCGCATT CAGCAAGTAA CCACCAATGG 4440 

GGAGATCTAC ATCATCGCTG GTGCCCCCAC TGACTGTGAC TGCAAAATTG ATCCAAACTG 4500 

TGACTGTTTT TCAGGTGATG GTGGCTATGC CAAAGATGCA AAGATGAAAG CCCCTTCCTC 4560 

CTTAGCAGTG TCGCCTGATG GAACCCTCTA TGTGGCAGAC CTCGGAAATG TTCGAATTCG 4620 

TACCATCAGC AGGAACCAAG CCCACCTGAA TGACATGAAC ATTTATGAGA TTGCTTCACC 4680 

OGCTGATCAG GAACTGTACC AGTTCACTGT AAATGGAACC CACCTACACA CCCTGAACTT 4740 

GATAACAAGG GACTATCTTT ATAACTTCAC CTACAATTCT GAAGGTGACT TGGGCGCGAT 4800 

TACCAGCAGC AATGGCAATT CAGTGCACAT TCGCCGTGAT GCAGGCGGAA TGCCGCTATG 4860 

GCTTGTGGTG CCTGGCGGAC AAGTATACTG GCTGACTATA AGCAGCAATG GASTCCTGAA 4920 

AAGAGTGTCA GCCCAAGGCT ATAAICCGGC CTTAATGACC TATCCAGGAA ACACAGGGCT 4980 

TCTGGCTACC AAAAGTAACG AAAATGGATG GACAACCGTT TATGAGTATG ACCCCGAGGG 5040 

ACACCTGACC AATGCAACGT TTOCCACTGG AQAGGTCAGC AGCTTCCACA GTGAOCTGGA 5100 

GAAGCTGACA AAAGTGGAGC TAGATACTTC CAACCGTGAA AATGTCCTCA TGTCAACCAA 5160 

CTTGACGGCA ACTAGTACCA TATATATCTT AAAACAAGAA AATACTCAAA GTACCTATCG 5220 

GGTGAATCCA GATGGTTCCC TGCGTGTCAC TTTTGCCAGC GGGATGGAGA TCGGCCTCAG 5280 

CTCAGAGCCC CACATCCTGG CAGGGGCAGT CAAOCCTACC CTGGGCAAAT GCAACATCTC 5340

ATTGCCCGGA GAGCACAATG CAAACCTCAT OGAGTGGCGG CAGAGGAAGG AGCAAAACAA 5400 

AGGCAATGTT TCGGCTTTTG AAAGGAGGCT GAGGGCCCAC AACAGAAACC TACTCTCCAT 5460 

AGATTTTGAT CATATAACCC GCACAGGAAA GATCTATGAT GACCATCGAA AATTCACCCT 5520 

TCGAATTCTT TATGACCAGA GTGGGCGACC CATTCTGTGG TCTCCTGTAA GCAGATATAA 5580 

TGAAOTGAAC ATCACATATT CACCTTCGGG ATTGGTGACG TTTATTCAAA GAGGAACGTG 5640

GAATGAAAAA ATGGAATATG ACCAGAGTGG GAAAATTATT TCAAGAACTT GGGCTGATCG 5700 

GAAAATTTGG AGCTATACCT ACTTAGAAAA ATCTGTGATQ CTTCTCCTAC ACAGCCAGCG 5760 

GCCTTACATC TTTGAGTATG ACCAATCAGA TTGCCTGCTG TCAGTTACCA TGCCTAGCAT 5820 

GGTGCGCCAC AGCTTACAAA CCATGCTTTC AGTGGGCTAC TACCGTAATA TCTACACCCC 5880 

ACCGGACAGT AGCACTTCTT TTATCCAAGA CTATAGTCGA GATGGCCGAT TGCTACAQAC 5940 

CCTGCATCTG GGGACAGGGC GCAGAGTCTT ATACAAGTAC ACCAAGCAAG CAAGGCTTTC 6000 

TGAGGTTCTC TATGATACCA CTCAGGTCAC ATTAACATAT GAAGAGTCTT CTGGAGTGAT 6060 

TAAGACAATA CACCTGATGC ATGACGGATT CATCTGCACA ATCAGATACA GGCAAACAGG 6120 

ACCTCTTATT GGACGCCAGA TTTTCAGATT CAGTGAAGAA GGCCTTGTGA ATGCACGGTT 6180 

CGACTACAGC TACAACAATT TCCOAQTCAC AAGCATGCAA GCTGTAATCA ATGAAACCCC 6240 

TTTGCCTATA GATCTTTACC GATATGTTGA TGTCTCTGGC AGAACAGAGC AGTTTGGAAA 6300 

ATTCAGTGTA ATTAATTACG ATTTAAATCA GSTCATAACT ACTACAGTGA TGAAACACAC 6360 

CAAAATCTTC AGTGCCAATG GACAAGTCAT TGAAGTCCAA TATGAAATCC TAAAGGCAAT 6420 

TGCCTACTGG ATGACCATTC AATATGATAA TGTGGGCCGA CATGGTAATA TGTGCAXAAG 6480 

GGTAGGAGTA GATGCCAATA TAACAAGGTA CTTCTATGAA TACGATGCTG ATSGGCAACT 6540 

TCAGACTGTT TCTGTAAATG ACAAAACCCA GTGGCGTTAT AGTTACGATC TGAATGGAGA 6600 

CATCAACCTC TTAAGCCATG GGAAGAGTGC TCGTCTTACT CCTCTCCGAT ATGACCTCCG 6660 

AGACCGCATC ACCAGATTAG GAGAAATTCA GTATAAAATG GATGAAGATG GCTTTCTGAG 6720 
GCAGAGGGGA AATCATATTT TTGAAXATAA TTCTAATGGC CTGCTGCAGA AAGCCTACAA 6780
TAAGGCTTCT 6GCTGGACTG TGCAGTATTA CTATGATGGG CTTGGGCGAC GTGTCGCGAG 6840 
TAAGTCCAGC CTAGGGCAGC ACCTTCAOTT CTTTGTCQAC GCGACCGCGA ACCCCATAAG 6900 
AGTTACTCAT TTGTACAACC ACACAAGCTC GGAGATTACA TCTCTGTATT ATGAICTCCA 6960 
AGGTCACCTT MTGCCATGG AGTTAAGCAG TGQTGAAGAA TATTATGTAG CCTGTGAIAA 7020 
TACAGGTACC CCACTAGCTQ TGTTCAGCAG CCOAGGTCAG GTCATAAAGG AQATACTATA 7080 
CACACCTTAT GGCGATATCT ATCATGACAC TTACCCTGAC TTTCAGGTCA TAATTGGTTT 7140 
TCATGGAGGA CTCTATGATT TCCTTACTAA ATTAGTGCAC CTGOGGCAAA GGQATTATGA 7200 
TQTTGTTGCT GGCAGATGGA CAACGOCCTA TCATCACATA TGGAAACAGT TGAACCTCCT 7260 
TOCTAAACCA TTCAACCTCT ACTOCTTTGA AAAXAACTAC CCAGTTGGCA AAATTCAAGA 7320
TGTTGCAAAQ TATACCACAG ACATCAOAAG TTGOTTGQAQ CEATTTQGTT TCCAATTACA 7380 
CAATGTACTA CCTGGATTTC CCAAACCTGA ATTAGAAAAT TTAGAATTAA CTTACGAGCT 7440 
TCTACGOCTT CAGACAAAAA CTCAAQAGT6 GGATCCTGGA AAGACTATCC TGG6CATTCA 7500 
GTGTGAACTC CAGAAACAGC TCAGGAATTT CATTTCCTTQ GACCAACIAC CTATGACTCC 7560 
CCGATACAAT GATGQACGGT GCCTTC3AAGG AGGGAAGCAA CCAAGGTTTG CTGCTGTCCC 7620 
TTCTCTTTTT GGGAAAGGTA TAAAATTTGC CATCAAGQAT GGCATAGTAA CAGCTOATAT 7680 
TATAGGAGTA GCCAATGAAG ATAGCAGGCG GCTTGCTGCC ATTCTCAATA ATGCCCATTA 7740 
CCTGSAAAAC CTACATTTTA CCAXAGAGGG GAGGGACACT CACTACTTCA TTAAGCTTGG 7800 
GTCTCTGGAG 6AAGACCIGG TCCTCATCGG TAACACTGGG GGGA6GCGGA TTCTGGAGAA 7860 
TQGTGTCAAT GTCACTGTGT CCCAGAT6AC TTCTCTGTTO AATGGGASGA CTAGAOGGTT 7920 
TGCAGATATT CAGCTCCAGC ATGQAGCCCT OTGCTTCAAC ATCCGOTATQ GCACAACTQT 7980 
G6AAGAG6AA AAGAATCACG TGTTGGAGAT TGCCAGACAG CGCGCAGTGG CCCAGGCCTG 8040 
GACTAAGGAA CAAAGAAGGC TGCAAGAGGG GGAAGAGGGO ATTAGGGCAT GQACAOAAGQ 8100 
GGAAAAGCAG CAGCTTTTQA GCACTGGGCG GGTACAAGGT TAOQATGGGT ATTTTGTTTT 8160 
GTCTGTTGAG CAGTATTTAG AACTTTCTGA CAQTGCCAAT AATATTCACT TTATGAGACA 8220 
GAGCGAAATA GGCAGGAGGT AAC AAAAATA TCTCTGCCTT TGCGTCACCA AAGACTGGCT 8280 
GTTTTTAAAA CATAAAATGG TTTATTGTAT TGGTTTTCTA GATCAGAACT CTGXATATGT 8340 
AAAXATGGAG GAAAAACATA TCCAACTGCC TTTCAATGTS ACGGAACATO GTATTTTAAT 8400 
ATT G TTTG T T TAAACTCTTT AAQAAATGAC AGAGATTTTT AGTTC1TGTG TGGCAGTATT 8460 
CAAAATAACA CAABTAGAAC TCAAACAGCT AAAAACAGTT TTCAGAAACC ACCACTTTCA 8520 
ATTTGCCGAG CCATGCATAT GTTCCAATAT CCAGAAAGAA CCCAAGGTTC TCTATCTCTA 8580 
TTGTGAGAAG CAGCTTCATC CTTAACTQTT GGCAGAACTT ACGGGCTATT TGAATAGGTG 8640 
GTGCAATAGT ATCTGAAACT TGCCTTTCGA AA6ACTG0CA GCCCTTTGAC GTTTTCCAGA 8700 
TCTGTTATAG GAAACTTAAA AACAGGTGTA AAATGTCTTC AGCCACCATC TCCTAGASTG 8760 
AGGACCCAAT TGCCCTTCCT TCTTGATTAT TCCTCCTTGC TTGTTAAAGT AAATGCCATA 8820 
TTGTTGTGCT GTGTTTTGGC GTGTGGTGGC TGGGTTCTGT CTAOCATGCT TCCCTGTGGG 8880 
TGTGGTAACC AQACTGTATA GCCGCTATTT GCTCGTGTGT ACAT6ATACC AAAGCAGCTG 8940 
GCCAGCGTGA CCTCTCTCAC ACGACCTGTT TTGACTCAAT TTTTTACTAA AAGTTGTTCA 9000 
GCTCTATTGG TATCATGTAA ACATAGCTTT TATTAACCTG GOTAGGAATT TCTCATTTAT 9060 
ATATAGGATG TGTTTTGGTC ATAGTTTCAC ATTAGTGATT CAjGTATCTAT ACACTGACCC 9120 
AATGGTTTTG TGCACATGAA CGGTAATTTA CTTAAAAGTA TCATTCTGGT ACAAAAACAA 9180 
ACAAAGGCTT TAGCAGGCAT ACGTGTCTGG GATGCCGATA CATACATTAA CTACTACTGC 9240 
AGAAATTCAT AAGAGCCAAA ACCTTAAAAA AAXAGACCTG GTACTTAAGT GAAAGTACTA 9300 
AAGGGAAGAC CAGACCAAAC ATCACAGCAG TTGCTCCCAC ATTGTTTCAJ3 CCCACTTAGA 9360 
TTTATCTTTC AAATGTACAA TTCTGTATTG AACATCTCCC AGCCATCTTC AGGAAATCGA 9420 
ATCAAGTAAA TCCTTTCCAA CCGAAAACAT TTCAACTAAC TATAGAGAGG CAGACTCATT 9480 
TTTACTAAAA TAATTTATAC AGTTAGTTAT TTTCGTTCTC COTACTTACC CATTTATCTT 9540 
TATTTAATCG TCTCTACTGC CTAJ3GAAAAT AACTATTTTC CAGGACGGGT TATTTGTTCT 9600 
GCGATCATTT AAAATTTGGA GAAAGGTCAG GATTAGTGTT AATATCAGCT GCAGTTTCTC 9660 
AAICTCTAGG AATCCTGCAG TAAAACAAGC OO CTT GGTGA GCTGGAAGAT TTGTGCCCAG 9720 
TGACAAAGAG ATAGTTTGTA AAATGCTGTG TAATTGTAAG TTACCACAAA TGAAAATACA 9780 
TGACAGCACA ATGTGGCCCG TAGAAAATTC CCCTGAGCCA GCTTCTGCAC TTTCATCACC 9840 
GAATCTGAAC ATTTGCTATG TCTGAAGGCA AATTTATGAT GGAATGTTAG TTTGGATTCT 9900 
TTCCAGATGC TACCTAAATG CAGTGTGGGG TCATT G CCCT GCTTTGCGAT GACAGTTtCT 9960 
TTGAAAATAT GCAAAGTCAT AAGCTCATGT TAAGGTTTTT CAAGAGTCTG CCTCCTACTA 10020 
CACAAAGGAA AGCAAGGGAA AGGAAATGAC OCTGGCAAAC AGTAGGGAAG GGTGTATTCA 10080 
AACATTTCAT TTTCAAAACC TTCGGGTTAQ AATACCACTT ACACATQTAT TCTGAGAGAC 10140 
AGAATTCATG AGGAACTCAT CTCTCTTTAT AACTGGAAAC ACACCAGCTT GATATATTGC 10200 
TAATCCATAC TAAAATCATA TTATTGGGTT TTTTCTOAAT CAGGCCTGTA TTAATGGTAC 10260 
AGTATTTATT CAGAATGGAA TTCTAAAATT ACTAACAAAC TTGTTGAAAA TTTGAATACC 10320 
TCCACACCAA CCTAAAAATG GACCTTAAGT TCCTAGAACC TCTGATGTTC TTTTAAATTA 10380 
ATGGAAAAAT AATTTGTGAA CTGTATATAG AGAGTGCATT CATAAATGTG ATTATGTATT 10440 
TTATCACAAA TCCAAAATGT CAATATTAGA GTCTATTTTQ CTTATATTTT AAGCAATTAT 10500 
ACGTTTTTGC AATTCATTGA TGATGTATCA TTTTCAAACT GCTTTAAATA TCCATTAGAA 10560 
ACAAATATTT GAAGCTTTTA CTTAATAGTO ATTACCTTGA ACTGTGCATT TCTAGTTTGT 10620 
AATACGTATT TGGTTGGTTC GTGOCTTTAG TTTCTTAAAG TTACATTTGT ATTATATTCA 10680 
GGAAATGCAC TTTTTATTAC TTACAGCTGT GGTTTTAATA CTGCCTTGAA CTATTATTAT 10740 
TCTTTTTACA ACTCCTAAAG CTTGAGGGAG GAAAGAAAAA AAAAACAAAA CTACEAATCA 10800 
GTAGTAAATC GAAGAGAAAC ATTTTGGCAT TTCTTAAGAA GAAGATGGAG ATATTGAGTA 10860 
TATCACTTCC TATTCAGCTG AATAGAAAGA ATGCCTTCAT TGACTTGCAG TTCTGCAGTT 10920 
TAAATTATTG AAAGAACAAT TCGTTTGCAT TTCCTGATGA AAGTAAAAGC ATTTTTCAGA 10980 
GAAACATATG AATTTCTCAT ACCCAGCAGA CAGATGGCTG ACACTGCACA GCCACACACC 11040 
ATTCGAGTAA GTTAAAGTGA GAGCAXAGTA GTTGGACTCT CCTATGAAGA ACATTCTGGG 11100 
CTGGAGGCAG GGAATACTCC ATGGTTGTTT CTTTTTCCTA CTTAAGCCCA TTTTGITTGT 11160 
GCTTTTCTGT TTTGTTTTGT TTTCACTCTT GCACTACAGT CTAGAGATCC AAATGAACTG 11220 
AAAAGTTCAA AGTTTAACAC ATTTAAATAT GTTTACTTTT AGTTGTCATT CTAATCGTTA 11280 
TTGATTAGAA GCATGACTCC TGAAGGAAAG GGAAATAAAT CTCAATTCAT ACTAACTTGC 11340 
AACAAAACAC TTTTACCATA TAAATAAGTA TATGATTTAT TTTTAACCCA AAAAATGTAT 11400 
AAAATAAGTG TGTCCTTTAC TGTCAATTTA TCGAGAAGAT CTATAAIATA TAGACTACAT 11460 
ATATATAATA TATACAACAT AGCCAAATGT ATGAAAACTT GACAATGTAT AATTTGGAAT 11520 
TCACATGCTA CCTATGTAGA CAGGTATGRA ATTAAGTTAT AATTTTCATG AGACATTTTC 11580
ATCACTGTTG ACACAGTTTC AAGGCATTOC ATCATGTTAT TTTGACTCTT TTTCTTTTTT 11640 
TTTTCnTAA AAATATATTT TTAACTAGAC CAGGCCCCAC TATAATATCA CTTOAGAGAG 11700 
TCAGGGCAAA GTTTTTGCAT TTATGAAGAT GTGTTCATGT AAGGGTGATT GTAATGGAGT 11760 
TCATTGGTAA TAGAAGCAAA AGTACAQTAA CGAAGTATTG AAAAGAAAAT TTTGGAGACA 11820
TTGGAGCAXA TTATATATAG CTTGTGGAAA GACAIAAGGC TACAGATGGA ATGGAACATT 11880 
O CTQTTTT CT TGAAGAAATT CACATACACA TAGCTGACCT GACTAGTACT TCAGCTCTTC 11940 
CJCAGCCTTC TATAARGGTT CTTTCTTCTG CAAAQAAAAC AAAACAAAAC AAAACAAAAC 12000 
AAAAAAAAAC AAAAAAACCO CAAAAAACAA AAAAACAAAA AAAAGCAAAG TAAAATTTAA 12060

AAATACAGAA AACAAACAAC AAAAAAGAAT TCAACCATAA ATAGTGACTA TTATTTTCAG 12120
TGTGTCCTTC ATGTGAAAGC TATTAAGGAC CAAAXAXACT ACTGTTCATA AGAAGAAATT 12180 
ACTTTCTAAA CAGTAACTGA AAATACTTAG ACTTAAACTT GCTGTOGATT TTGTCTTGGC 12240 
AOTTGTCATC TTACATTATT TGTCAAAGGA AAT GTGT T TG GCAGTTAAAA ATCTTTCCTT 12300 
AGATTTAGTG GT66ACTTTA ACCTCTTAAA TAAATGTXAG TATATCAGAT TGTGTCCTTG 12360 

AAAAATATTT TACTTGTAXG AAXCATQACA ACGTCTAAA? CTTTACTATT CTTCTGGCAA 12420
AAGCATCAGT AAGAAAGAAG GCGAAAAAGA GAAGTATAGC CTTTATGTCA GAAAA ACATT 12480 
CTTTTTAGCT G CCTA C 1T1 C TCATGAAAAG TAAAGATGTT TACAGTGTAT GCCAAGTTTT 12540 
CAQTTTCTOT ATAACAACAO GTAGAGGTTC TAATCAXATT GAAAATTGTG TTATAATGGT 12600 
CTGAOCCATS TTGCTAGGAA ACAATAGGTT CCAATTTTGT ATTCCTGCTC TCCTGTGCTG 12660

AAAAGTGACT GGATACTGTA CAGGTTCATG TTCTCTGGCT GCAGTXAAAT GGTCTTTTGC 12720
ATT T TG C TCT GGCTTTCAGG CCAGAAGCAT GCATTTTTCT ACAAGAGCAT. CAC AACAA CA 12780 
TGCTGTAAAT ATTTAAAGTT AAACATTATG TGTTGAXATT TGAAAGAAAA GTACTTTGAA 12840 
TATTTCATTT TEAAAAAATA AAATTGCCAA TGAAAAAAAA 

SEQ ID NO:230 PEZ2 Protein sequence:

1 11 21 31 41 51

| | | | | |

MEQTDCKPYQ PLPKVKHEKD LAYTSSSDES EDGRKPRQSY NSEETLHBVN QBLKMNYHSQ 60 

SRKRKEVEKS TQEMEFCETS HTLCSGYQTD MHSVSRHGYQ LEKGSDVDTB TEGAASPDHA 120 

LRMWIRGHKS EBSSCLSSHA HSALSLTDTD HERKSDGENG PRPSPVCCDH EAQAGSTQDV 180

QSSPHNQFTP RPLPPPPPPP HACTCARKPP PAADSLQPJtS MTTRSQPSPA APAPPTSTQD 240

SVHLHNSWVL NSNIPLETRH SLFKHGSGSS AIFSAASQNY PLTSNTVYSP PPHPLPRSTP 300 

SRPAPTFNKP YRCCNWKCTA LSATAITVTL ALLLAYVIAV HLFGLTWQLQ PVEGELYANG 360 

VSKGNRGTES UDTTK5PIGG EVSDKSEKKV FQKGRAIDTG BVDIGAQVMQ TIPPGLPWRF 420 

QITIHHPIYL KFNISIAKDS LI/SIYGRENI PPTHTQFDFV KLHDGKQLVK QDSKGSDDTQ 480

HSPRNLILTS LQETGPIEYM DQGPWYlAFTf NDGKKMEQVF VLTTAIBIMD DCSTNCNGNG 540

ECISGBCHCP PGFLGPDCAR DSCPVLCGGN GEYEKGHCVC RHGWKGPBCD VPEEQCIDPT 600 

CPGHGTCIMG VCICVPGYKG EZCEEEDCLD PHCSNHGICV KGECHCSTGW GGVNCETPLP 660 

VCQEQCSGHG TPLU1AGVCS CDPKWTGSDC STETjCTMBCG SHGVCSRGIC QCEEGWVGPT 720 

CBBRSCHSHC TEHGQCKDGK CBCSPGWEGD HCTIAHYLDA VRDGCPQLCP GNGRCTLDQH 780 

GWHCVCQVGW SGTGCNWHE MLCGDHLDND GDGt-TDCVDP DCCQQSHCYT SPLCQGSPDP 840

LDLIQQSQTL PSQHTSRLPY DRIKPLIGKB STHVIPPEVS FDSRRACVIP. GOWAIDGTP 900 

LVGVNVSFLH HSUYGP TI SR QDGSFDUVAI GOISVILIFD RSPFLPEKFJ LWLPWNQFIV 960 

VEKVTKQRW SDPPSCDISN FISPNPIVLP SPLTSFGGSC PERGTIVPBL QWQEEIPIP 1020 

SSFVRLSYLS SRTPGYKTLL RILLTHSTIP VGMTKVHLTV AVEGSI/TQKH PPAAINLVYT 1080 

FAKNKTDIYG QKVWGLAEAL VSVGYEYBTC PDFILWEQRT WLQGFEHDA SNLGDWSUK 1140

HHILNPQSGI XEKGHGENMF ISQQPPVIST HflMGHQRSV ACTNCHGPAH NNKLFAPVAL 1200 

ASGPDGSVYV GDFNFVRRIP PSGNSVSILE LSTSPAHKYY LAHDPVSBSt. YLSDTNTSKV 1260 

YKLKSLVETK DLSKNPEWA GTGDQCLPFD QSHCGOGGRA SEASLNSPRG ITVDRHGFIY 1320 

FVDGTM1RKI DENAVITTVI GSNGLTSTQP LSCDSGMDIT QVRLEWPTDL AVNPKDNSLY 1380 

VLO01IVLQX SBKRKVRIIA GRPIHCQVPG IDHFLVSKVA IBSTLBSARA ISV5HSGLLF 1440

IAETDERKVN R100VTTNGB IYI1AGAPTD CDCKIDPNCD CFSGDGGVAK DAKHKAPSSL 1500 

AVSPDGTLYV ADLGNVRIRT ISRNQAHLND MMIYEIASPA DQELYQPTVN GTHLHTUILI 1560 

TMJYVWJFTY HSEGDLGAIT SSNGNSVBIR EDAGGMPLWL WPGGQVXHL TISSNGVLKR 1620 

VSAQGVNPAL MTYPGNTGLL ATKSNEKGWT TVYEYDPEGH LTHATPPTGB VSSFHSDLEK 1680 

LTKVELDTSK REHVLHSTNL TATSTIYILK QENTQSTYRV KFDGSLRVTF ASGHBIGLSS 1740

EPHILAGAVN PTLGKCNISL PGEHNANLIE HRQRKEQHKG NVSAFERRLR AHNBNLLSID 1800 

FHHITRTGKI YDDHRKPTLR ILYDQTGRPI LWSPVSRYNE VN3TYSPSGL VTFIQPJ3TWU 1860 

EKMEYDQSGK I1SBTHADGK IWSYTYLEKS VKLLLHSQRR YIFEYDQSDC LL5VTHPSHV 1920 

RHSLQTMLSV GYYBNIYTPP DSSTSPIQDY SRDGBLLQTL HLGTGRRVLY KYTKQARLSE 1980 

VLVDTTQVTL T1TEESSGVIK TIHLMHTX3FI CTIRYBQTGP LIQRQ1PRPS BEGLVHARFD 2040

YSYNNPF-VTS MQAVINETPL PZDLYRXVDV SGRTEQFGKP SVINYDUJQV ITTTVMKHTK 2100 

IPSAKGQVIE VQYEILKAIA VWMTIQYDNV GRHGNHCIKV GVDANITRVP YEYDADGQLQ 2160 

T/SVNDKTQW RYSXDUIGOI NLtSHGKSAR I/TPLRXDLRO RITRLGEIQY KBDEDGPLRQ 2220 

RGNDIFBXNS NGLLQKAYNK ASGWTVQYYY DGLGRRVASK SSLGQHLQFF VDATANPIRV 2280 

THLYNHTSSB ITSLYYDLQG HLIAHELSSG EEYYVACDNT GTPtAVFSSR GQVIKBILYT 2340

PYGDIYBDTY PDP07IIGFH GGLYDFLTKL VHD3QRDYDV VAGRWTTAYH HriJKQLMLL? 2400 

KPFNLVSPEN NYPVGKIQDV AKYTTDIRSW LELPGFQLBH VLPGPPKPEL EHLBLTYELL 2460 

RIiQTKTQEWD PGKTILGIQC BLQKQLRUFI SLDQLPMTPR YNDGRCLEGG KQPRFAAVPS 2520 

VPGKGIKFAI KDGIVTADII GVANEDSRRL AAILNNAHYL EHiaPTIEGR DTHYPIKLGS 2580

tEEDLVLIGM TGGRRILENQ VNVTVSQWTS UMGRTRRPA DIQLQHGALC PHIRYGTTVE 2640

EEKNHVLEIA RQRAVAQAWT KBQRRLQEGE EGIRAWTEGE KQQLLSTGRV QGYDGYFVLS 2700 
VEOYLELSDS ANNIHPHRQS BIGRR 

SEQ ID NO:231 PFD4 DNA SEQUENCE

Nucleic Acid Accession #: NM_000441
Coding sequence: 225-2567 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51


CTCAGCCTTC CCGGTTCGGG AAAGGGGAAG AATOCAGGAG GGGTAGGATT TCTTTCCTGA 60 

TAGGATCGGT TGGGAAAGAC COCAOCCTGT CTGTCTCTTT CCCTTCGACC AAGGTGTCTG 120 

TTCCTCCOT A AATAAAACGT CCCACTGCCT TCTGAGAGCG CTATAAAGGC AGCGGAAGGG 180 

TAGTCCCCG G GGCATTCCGG GCGGGGCGCG AGCAGAGACA GGTCATGGCA GCGCCAGGCG 240 

GCAGGTCCOA GCCGCOGCAO CTCCCCQAOT ACAGCTGCAG CTACATGGTG TCGCGGCCGG 300 

TCTACAGCGA GCTCGCTTTC CAGCAACAGC AGGAGOGGCG CCTGCAGGAG CGCAAGACGC 360 

TGCGGGAGAG CCTGGCCAAG TQCTGCAGTT GTTCAAQAAA GAGAGCCTTT GGTGTGCTAA 420 

AGACTCTTGT GCCCATCTTO OAGTGGCTCC CCAAATACCG AGTCAAGGAA TGGCTGCTTA 480 

GTGACGTCAT TTCGGGAGTT AGTACTGGGC TAGTGGCCAC GCTGCAAGGG ATGGCATATG 540 

CCCTACTAGC TGCAGTTCCT GTCGGATATG GTCTCTACTC TGCTTTTTTC CCTATOCTGA 600 

CATACTTTAT CTTTGGAACA TCAAGACAIA TCTCAGTTGO ACCTTTTCCA GTGGTGAGTT 660 

TAATGGTGGG ATCTGTTGTT CTGAGCATGG COOCCGACGA ACACTTTCTC QTATCCAGCA 720 

GCAATGGAAC TGTATTAAAT AGTACTATGA TAGACACTGC AGCTAGAGAT ACAGCTAGAG 780 

TCCTGATTGC CAGTGCCCTO ACICTG CT GG TTGGAATTAT ACAGTTGATA TTTGOTGGCT 840 

TGCAGATTGG ATTCATAGTG AGOTACTTGG CAGATOCTTT GQTTGGTGGC TTCACAACAG 900 

C1 W.1 WXH T CCAAGTGCTG GTCTCACAGC TAAAGATTGT CCTCAATGTT TCAAOCAAAA 960 

ACTACAATGG AGTTCTCTCT ATTATCTATA CQCTGGTTGA QATTTTTCAA AATATTGGTG 1020 

ATACCAATCT TGCTGATTTC ACTGCTGGAT TGCTCAOCAT TGTCGTCTGT ATSGCAQTTA 1080 

AGGAATTAAA TGATCGGTTT AGACACAAAA TCOCAGTCCC TATTCCTATA GAAGTAATTQ 1140 

TGACGAXAAT TGCTACTGCC ATTTCATATQ GAGCCAACCT GGAAAAAAAT TACAATGCTG 1200 

GCATTGTTAA ATCCATCCCA AGGGGG TTT T TGCCTCCTGA ACTTCCACCT GTGAGCTTOT 1260 

TCTCGGAGAT G&GGCTGCA ' i X J V'l'rrfO UA TCGCTGTGGT GGCTTATGCT ATTGCAGTGT 1320 

CAGTAGGAAA AGTATATGCC ACCAAGTATG ATTACACCAT OGATGGGAAC CASOAATTCA 1380 

TTGCCTTTGQ GATCAGCAAC AT CTTCTCAG GATTCTTCTC TTGTTTTGTG GCCACCACTG 1440 

CTCTTTCCCG CACGGCCGTC CAGGASAGCA CTGGAGGAAA GACACAGGTT .GCTGGCATCA 1500 

TCTCTGCTGC GATTGTGATG ATCGCCATTC TTGCCCTGGG GAACCTTCTG GAACCCTTGC 1560 

AGAAGTCGGT CTTGGCAGCT GTTGTAATTG CCAACCTGAA AGGSATGTTT ATGCAGCTGT 1620 

GTGACATTCC TCGTCTGTGG AGACAGAATA AGATTGATGC TGTTATCTGG GTGTTTACGT 1680 

GTAXAGTGTC CATCATTCTG GGGCTGGAIC TCGGTTTACT AGCTGGCCTT ATATTTCGAC 1740 

TGTTGACTGT GGTCCTGAGA GTTCAGTTTC CTTCTPGGAA TOGOCTTGGA AGCATCCCTA 1800 

GCACAGAXAT CTACAAAAGT ACCAAGAATT ACAAAAACAT TGAAGAACCT CAAGGAGTGA 1860 

AGATTCTTAG ATTTTCCAGT CCTATTTTCT ATGGCAATGT CGATGGTTTT AAAAAATGTA 1920 

TCAAGTCCAC AGTTGGATTT GATGCCATTA GAGTATATAA TAAGAGGCTG AAAGCGCTGA 1980 

GGAAAATACA GAAACTAAXA AAAAGTGGAC AATTAAGAGC AACAAAGAAT GGCATCATAA 2040 

GTGATGCTGT TTCAACAAAT AAIGCTTTTG AGCCTGATGA GGATATTGAA GATCTGGAGG 2100 

AACTTGATAT CCCAACCAAG GAAATAGAGA TTCAAGTGGA TTGGAACTCT GAGCTTCCAG 2160 

TCAAAGTGAA CGTTCCCAAA GTGCCAATCC AZAGCCTTGT GCTTGACTGT GGAGCTAXAT 2220 

CTTTCCTGGA C GTTCT TGGA GTGAGATCAC TGCGGGTGAT TGTCAAAGAA TTCCAAAGAA 2280 

TTGATGTGAA TGTGTATTTT GCATCACTTC AAGATTATGT GATAGAAAAG CTGGAGCAAT 2340 

GCGGGTTCTT TGACGACAAC ATTAGAAAGG ACACATTCTT TTTGACGGTC CATGATGCTA 2400 

TACTCTATCT ACAGAACCAA GTGAAATCTC AAGAGGGTCA AGGTTCCATT TTAGAAACGA 2460 

TCACTCTCAT TCAGGA3TGT AAAGATACCC TTGAATTAAT AGAAACAGAG CTGACGGAAG 2S20 

AAGAACTTGA TGTCCAGGAT GAGGCTATGC GTACACTTGC ATCCTGAAAG TGGGTTCGGG 2580 

AGGTCTCTAT GAGCAAGGAA TACAAGACAA AACTTOCTCA ATGCATTGAC TATTTCTTCA 2640 

GACTCAAAAC ACTCATTCTT TTTTCTATTA AGOCATTGAA AGAGAAGCAC TAAGACTGCT 2700 

TCTAGGCTTT ATTTATAAAA XAAACACCTT AICCCTAACA TGGGCAAAAT GGCTAGAATT 2760 

ATTCAGACGA TTTGGCAGCQ TCCAGGGTAA GCTGGTGTZA TAATACGCTG CTGATCTACA 2820 

TCACAGATTT GCXAATAATG TTCACGTGGG COCTGGCATA TCTCTGTTCA GTTAGAGTGA 2880 

GTGCTGACCC AACAGCCTCT GTGGTCAAJSC GAGTCAOQAA TGATTAATCA TAAAGAAAAA 2940 

TCAGTTTTTC ACTGACCTGG ATATCCATGA GCTGCACTGA TCACCATGTA AGGTCACATT 3000 

TAGTAAATGC TGAAATAAAA TGATTAATGC ATITAICAAT AAAAGOCTTT GAAAATACTT 3060 

TGGATAATAA ATTGGAGTTT TAAAAATGCA AATTTGCTTA GTATCTAAXA ATGAAGTGTT 3120 

ATTACATATA GCCGGAATTG AGGATCTCTT TGATCCTGGA AATGGTTTAC CTAAAAGCTA 3180 

CAGAACCAGG CCAATATATT TTGAAATATT GATGCAGACA AATGAAATAA TAAAGAGATT 3240 

TTCATGGTTT ATAAAAATCT TTTTTGATAT GATAAIAATC ATGATCACAA CXGAGATCAA 3300 

AAAAATATAT GACAGATTAT TTTGTTEAAA AATGCAGTTT TAATTATCTT AGTCTATAGA 3360 

AATGATCATT GCATGGAGGC ATGTATAGGT ATGATCTGTG TAAAATCTGA CATAAAAACA 3420 

GTGCTATTCT GAGTGAAAAT TTTTTTQATO TGCTTACATA ACCATGGTGA TTAAAATGAG 3480 

TTEATATTTT TTCTCAAAAA TTTTAGCAGT GTGTAAAGTA AGTAATCTTT AACTGAACTC 3540 

TGACCACTTA AAAAAAAATC TAAAAATTGA ACTACCTATA GTAGTCTGTG TTTAAAGTGA 3600 

ATTTTTAAAG ACAAAGCATT CTAAATGAAC TCAATATAAA AACATTCATT TGGAATGTAC 3660 

ATACTGAAAA ATACAGGTTT TTTTGACCAA AAGTTTTTAT ATCTTTPCTT TTTATTTATT 3720 

TTTTTCCTAA GTGCCAACAA TTTTCTAGAT ATTATATACA ACACAGGCOT TGATCTTGGG 3780 

GACTTTTCCC ATATATTTCA CACTGGAGTS AATGAAGTTG TACTTCATTT CTAGAGAAAA 3840 

GTTATACCCA GGTCCCCAAT TGAGAATGTC TTGCTTGATT GAAAACGACA TCATOCCTTG 3900 

GTATACTCCA GGGATTGGTT TCAGGACCGC TGCATTTACC AAAATTTGTG CACACTCAAG 3960 

TCCTGCAGTC ACCCCTGCCT AAAGATAGAA TGGCTTCTCT GTTTTTCTTC TGAAATACAA 4020 

OCAGAAACAA TGTGTCTATT TCTGAAAGAA TAGGATTAAT GATCATACAA ATGGGTTAAT 4080 

CCTGAATTCT GGTTGTAAAT CTG G TT A CAG CATAACTAGG ATTATAATGC TGCCTCATTT 4140 

TCACAGCACT ACTTGCTTAT ATTGACAACA AATCATCTCG CTAAAGAGTG AATGTAGGCC 4200 

AGGCGCGGTG GCTCATGCCT GTAATCOCAG CACTTTGGGA GGCCGAGGCG GGTGGATCAC 4260 

GAGGTCAGGA GATCGAGACC ATCCTGGCTA ACATGGTAAA ACCOCGTCTC •EACTAAAAAT 4320 

AGAAAAAAAG AAATTAGCCT AGCGTGGTGG CTGGCGGGCG CCTGTAGTCC CAGCTATTTG 4380 

GGAGGCIAAG GCAGGAGAAT GGCGTGAACC CGGGAGGCGG AGCTTGCAGT GAGCCGAGGT 4440 

OGTGCCACTG CACTCCAGCC TGGGCGACAG AGCAAGACTC CGTCTCAAAA AAAAAAAAAA 4500 
AAAAAAAAAA AOAGTGAATQ TAATAGTCTT 0CA6AAAATG 
AAAGGAAA.TA TGCACTGCTC ACTTTTTTGA AGGAAAIGCC 
GGCTAGAGTT TOTAAATTCT GQOTTCATTT GTGATGACAT 
TACTGTCTCT TCTATGTATT TTGTGAAXAG TAAGCAXAAT 
GAAAATTTCA CTTGAAATTA AA6CTGCCTT TTGTTATATT 
TCCAGTATTG TATATGAGTT TTAACAAATT AAAAAATCAA 
TTTGCACACA TTTAAAAATA AATGTAAAGT TGTCTTTTAA 
CTGAACAAAA 



SEQ ID NO:232 PFD4 Protein sequence: 

Protein Accession #: O43511 



1 11 21 31 41 51 

I I I I I I 

KAAPGGRSEP PQLPEYSCSY HVSRPVYSEL AFQQQHERRL QBRKTLRBSL AKCCSCSRKR 60 

AFGVLKTLVP ILEWLPKYRV KBWLLSDVIS GVSTGLVATL QGMAYALLAA VPVGYGLYSA 120 

FFPILTYFIF GTSHHISVGP PPWSLHVGS WLSMAPDEH FLVSSSNUTV LNTTHIDTAA 180 

RDTARVLIAS ALTLLVGIIQ LIPGGLQIGP IVRYLADPLV GGFTTAAAFQ VLVSQLKIVL 240 

HVSTKNTOOV LSIXYTLVEX FQHIGDWLA DFTAGLLTIV VCMAVKELND RFRHKIPVPI 300 

PIEVIVTIIA TAISYGANLB KNYNAGIVKS IPRGPtPPEL PPVSLFSEKL AASFSIAWA 360 

YAIAVSVGKV YATKYDYTID GNQBFIAFGI SKIPSGFFEC PVATTALSBT AVQESTGGKT 420 

QVAQIISAAI VMIAILALGK LLEPLQKSVL AAWIANLKQ MFHQLCDIPR LWRQNKIDAV 480 

IW V FTCI VS I XLGLDLGUA GLIPGLLTW LHVQPPSWNG LGSIPSTDIY KSTKNYKNIE 540 

EPQGVKILRP SSPIFXGHVD GPKKCIKSTV GFDAIHVYNK RLKALRKIQK LIKSGQLRAT 600 

KNGIISBAVS THNA FEPDE D IEDLEELDIP TKEIEIQVDW HSELPVKVHV PKVPIHSLVL 660 

OCGAISPLDV VGVRSLRVIV KEPQRHJVNV YFASLQDXVX EKLEQCGFFD DKIRKDTFFL 720 

TVHDAILYLQ KQVKSQEGQG SILETITLIQ DCKDTZjETjIE TELTEEELD7 QDEAMRTLAS 780 



SEQ ID NO:233 PFH2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_016029 

Coding sequence: 228-1097 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

CTCCGATCCC GCAGGGCAGC GACGCGACTC TGGTGCGGGC COTCTTCTTC CCCCCGAGCT 60 

GGGCGTGCGC GGOOGCAATG AACTGG6AGC TGCTGCTGTG GCTGCTGGTG CTGTGCGCGC 120 

TGCTCCTGCT CTTGGTGCAO CTGCTGCGCT TCCTGA6GGC TGACOCXJGAC CTGACGCTAC 180 

TATGGGCCGA GTGGCAGGGA CGACGCCCAO AATGGGAGCT GACTGATAXG GTGGTGTGGG 240 

TGACTGGAGC CTCGAGTGGA ATTGGTSAG6 ASCTGGCTTA CCAGTTGTCT AAACTAGGAG 300 

TTT C T C TTC T GCTGTCAGCC AGAAGAQTGC ATGAGCTGGA AAGGGTGAAA AGAAGATGCC 360 

TAGAGAATGG CAATTTAAAA GAAAAAGATA TACTTGTTTT GCCCCTTGAC CTGACCGACA 420 

CTGGTTCCCA TGAAGC6GCT ACCAAAGCTG TTCTCCAGGA GTTTG6TAGA ATCGACATTC 480 

TGGTCAACAA TGOTGGAATS TCCCAGCQTT CTCTGTGCAT GGATACCAGC TTGGATGTCT 540 

ACAGAAAGCT AATAGAGCTT AACTACTTAJQ GSACGGTGTC CFTGACAAAA TGTGTTCTGC 600 

CTCACATCAT GGACAGGAAG CAAGCSAAAGA TTGTTACTGT GAAXAGCATC CTGGGTATCA 660 

TATCTGTACC TCTTTCCATT GGATACTGTO CTAGCAAGCA TGCTCTCCG3 GGTTTTTTTA 720 

ATGGCCTTCG AACACAACTT GCCACATACC CAGGTATAAT AGTTTCTAAC ATTTGCCCAG 780 

GACCTGTGCA ATCAAATATT GTGGAGAATT CCCTAGCTGG AGAAGTCACA AAGACTATAG 840 

GCAATAATGG AGACCAGTCC CACAAGATGA CAACCAGTCG TTGTGTGCGG CTQATGTTAA 900 

TCAGCATOGC CAATGATTTG AAAGAAGTTT GGATCTCAGA ACAAOCTTTC TTGTTAGTAA 960 

CATATTTQTO GCAATACATG CCAACCTGGG CCTGGTGGAT AACCAACAAG ATGGGGAAGA 1020 

AAAGGATTGA GAACTTTAAG AGTGGTGTGG ATGCAGACTC TTCTTATTTT AAAATCTTTA 1080 

AGACAAAACA TGACTGAAAA GAGCACCTGT ACTTTTCAAG CCACTGGAGG GAGAAATGGA 1140 

AAACATGAAA ACAGCAATCT TCTTATGCTT CTGAATAATC AAAGACTAAT TTGTGATTTT 1200 

ACTTTTTAAT AGATATGACT TTGCTTCCAA CATGGAATGA AATAAAAAAT AAATAATAAA 1260 
ASAITGCCAT GAATCTTGCA AA 
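
The records in this listing print each sequence in rows of space-separated ten-residue blocks, with a running position count at the end of each full row, and give the coding region as a coordinate pair in the record header (for example 228-1097). The following is a minimal Python sketch of how such rows could be joined and a coding region cut out. It assumes the coordinates are 1-based and inclusive, which the listing itself does not state; the function names `parse_listing` and `coding_region` and the sample rows are hypothetical and are not taken from the listing.

```python
def parse_listing(rows):
    """Join sequence rows into one string.

    Each row is assumed to hold space-separated residue blocks,
    optionally followed by a running position count such as "60".
    """
    pieces = []
    for row in rows:
        tokens = row.split()
        # Drop the trailing position count if the row carries one.
        if tokens and tokens[-1].isdigit():
            tokens = tokens[:-1]
        pieces.append("".join(tokens))
    return "".join(pieces).upper()


def coding_region(seq, start, end):
    """Return residues start..end, assuming 1-based inclusive coordinates."""
    return seq[start - 1:end]


# Hypothetical toy rows, not taken from the listing.
rows = ["CCATGGCTTA 10", "ACC"]
seq = parse_listing(rows)
print(coding_region(seq, 3, 11))
```

Position counts that were damaged in extraction (a letter in place of a digit, for instance) would not be recognised by the `isdigit` check and would need to be cleaned before parsing.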



SEQ ID NO:234 PFH2 Protein sequence: 

Protein Accession #: NP_057113 



1 11 21 31 41 51 

I I I I I I 

KNWELLLWLli VtiCALLLtLV QtXRFLRADG DI/TLLWAHHQ GRMEWELTD HWWVTGASS 60 
GIGEELAYOL SKLGVSLVLS ARRVHELERV KHECLEHGNL KEKDILVLPL DLTDTGSHEA 120 
ATKAVLQEFG RIDII.VNNGG MSQRSLCHDT SLDVYRKLIE LNYLGTVSLT KCVLPHMIER 180 
KQGKTVTVNS II/GHSVPLS IGYCASKHAL RGPFKGLRTE LATYPGIIVS NICPGPVQSH 240 
IVEKSLACEV TETIGNNGDQ SHKMTTSRCV RLULISKAND LKEVWISEQP FUA/TVUtQY 300 
HPTWAWWITH IM3KKRIENP KSGVUAOSSY FEdFKTKHD 



AATGAATACC TTTOTTCAAT 4560 

AAACTTAOGT TTTACAACAA 4620 

AAGTCAGCAA ACTQCGGGAA 4680 

TTTAGTTTTG TATTATCAAT 4740 

TTTAACCTAT AGGATAAGAT 4800 

ATCATGTACA TTTGAAAATA 4860 

ACTACTCGGA TGTOTCCTTT 4920 



SEQ ID NO:235 ACC5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_000450 

Coding sequence: (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 SI 

I I I I I I 

ATGATTGCTT CACAGTTTCT CTCAGCTCTC ACTTTGGTGC TTCTCATTAA AGAGAGTGGA 60 

GCCTGGTCTT ACAACACCTC CACGGAAGCT ATGACTTATG ATGAGGCCAQ TGCTTATTGT 120 

CAGCAAAGGT ACACACACCT GQTTGCAATT CAAAACAAAO AAGAGATTGA GTAOCTAAAC 180 

TCCATATTGA GCTATTCAOC AAGTTATTAC TGQATTGGAA TCAGAAAAGT CAACAATCTG 240 

TGGGTCTGGG TAGGAACGCA GAAACCTCTS ACAGAAGAAG CCAAGAACTG GGCTOCAGCT 300 

GAACCCAACA ATAGGCAAAA AGATGAGGAC TGCGTGGAGA TCTACATCAA GAGAGAAAAA 360 

GATGTGGGCA TGTGGAATGA TGAGAGGTGC AGCAAGAAGA AGCTTGCOCT ATGCTACACA 420 

GCTGCCTGTA CCAATACATC CTGCAGTGQC CACGGTGAAT GTGTAGAGAC CATCAATAAT 480 

TACACTTGCA AGTGTGACCC TGGCTTCAGT GGACTCAAGT GTGAGCAAAT TGTGAACTGT S40 

ACAGCCCTGG AATCCCCTGA GCATGGAAGC CTGQTTTGCA GTCACCCACT GGGAAACTTC 600 

AOCTACAATT CTTCCTGCTC TATCAGCTGT 6ATAGGGGTT ACCTGCCAAG CAGCATGGAQ 660 

ACCATGCAGT GTATGTCCTC TGGAGAAIGS AGTGCTCCTA TTCCAGCCTG CAATGTGGTT 720 

GAGTGTGATG CTGTGACAAA TCCAGCCAAT GGGTTCGTGG AATGTTTGCA AAACCCTGGA 780 

AGCTTCCCAT GGAACACAAC CTGTACATTT GACTGTGAAQ AAGGATTTGA ACTAATGGGA 840 

GOCCAGAGCC TTCAGTGTAC CTCATCTGGG AATTGGGACA ACGASAAOCC AACGTGTAAA 900 

GCTGTGACAT GCA6GGC0GT CCGOCAGCCT CAGAATGGCT CTGTGAGGTG CAGCCATTCC 960 

CCT6CTGGAO AGTTCACCTT CAAATCATCC TGCAACTTCA CCTGTGAGGA AGGCTTCATG 1020 

TTGCAGGGAC CAGCCCAGGT TGAATGCACC ACICAAGGGC AGTGBACACA GCAAATCCCA 1080 

GTTTSTQAAQ CTTTCCAOTG CACAGCCTTG TCCAACCCCG AGCGA6GCTA CATQAATTST 1140 

CTTCCTAGTG CTTCTGGGA6 TTTCCGTTAT GGGTCCAGCT GTGAGTTCTC CTGTGAGCAG 1200 

CGTTTTGTGT TGAAGGGATC CAAAAGGCTC CAATGTGGCC CCACAGGGGA GTGGGACAAC 1260 

GAGAAGCCCA CATOTQAAGC TQTGAGATGC GATGCTGTCC ACCAGCCCCC GAAGGGTTTG 1320 

GTGAQGTGTG CTCATTOOCC TATTGGAGAA TTCACCTACA AGTCCTCTTG TGCCTTCAGC 1380 

TGTGAGGAGG GATTTGAATT ATATGGATCA ACTCAACTTG AGTGCACATC TCAGGGACAA 1440 

TGGACAGAAG AGG T TCCTTC CTGCCAAGTG GTAAAATGTT CAAGCCTGGC AGTTCCGGGA 1S00 

AAGATCAACA TGAGCTGCAG TGGGGAGCCC GTGTTTGGCA CTGTGTGCAA GTTCGCCTGT 1560 

CCTGAAGGAT GGACGCTCAA TGGCTCTGCA GCTCOGACAT GTGGAGCCAC AGGACACTGG 1620 

TCTGGCCTGC TACCTAOCTG TGAAGCTCCC ACTGAGTCCA ACATTCCCTT GGTAGCTGGA 1680 

CTTTCTGCTG CTGGACTCTC CCTCCTGACA TTAGCACCAT TTCTCCTCTG GCTTCGGAAA 1740 

TGCTTACGGA AAGCAAAGAA ATTTGTTCCT GCCAGCAGCT GCCAAAGCCT TGAATCAGAC 1800 
GGAAGCTACC AAAAGCCTTC TTACATCCTT TAA 



SEQ ID NO:236 ACC5 Protein sequence: 
Protein Accession #: NP_000441 



1 11 21 31 41 51 

I I I I I I 

MXASQFLSAL TLVLLXKESG AWSYNTSTEA HTYDEASAYC QQRYTHLVAI QKKEBIEYIiN 60 

SILSYSPSYY WIGIRKVNNV WVWVGTQKPL TEEAKNWAPG EPNNRQKDED' CVEIYIKREK 120 

DVGBWNDERC SKKKLALCYT AACTNTSCSG HGECVETIHM YTCKCDPGFS GLKCEQIVNC 180 

TALES PEHGS LVCSHPLGNF SVNSSCSISC DRGYLFSSHE THQCMSSGEW SAPIPACNW 240 

ECDAVTNPAN GFVECFQNPG SFPWNTTCTP DCEEGFELHG AQSLQCTSSG HWDHEKPTCK 300 

AVTCRAVRQP QNGSVRCSHS PAGBFTPKSS CKFTCEEGFM LQGPAGfVECT TQGQWTQQIP 360 

VCEAFQCTAI* SHPERGYKHC LPSASGSFRY GSSCEFSCEQ GFVLKGSKRL QCGPTGEWDN 420 

EKPTC2AVRC DAVHQPFKGL VRCAHSPIGB FTYKSSCAFS CEEGPBLYGS TQLECTSQGQ 480 

WTEEVPSCQV VKCSSLAVPG KIHKSCSGEP VFGJTVCKFAC PEGWTLNGSA ARTCQATGHW 540 

SGLLPTCEAP TESNXPLVAG LSAAGLSLLT LAPFLLWLRK CLRKAKKFVP ASSCQSLBSD 600 
GSYQKPSYIL 

SEQ ID NO:237 PM28 DNA SEQUENCE 

Nucleic Acid Accession #: N51002 

Coding sequence: 1-3793 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGATGTGTG AAGTGATGCC CACGATTAAT GAGGACACOC CAATGAGCCA AAGGGGOTCC 60 

CAAAGCAGTG GCTOGGACTC AGACTCCCAT TTTGAGCAGC TGATGGTGAA TATGCTAGAT 120 

GAAAGGGATC GTCTTCTAGA CACCCTTCGG GAGACCCAGG AAAGCCTCTC ACTTGCCCAG 180 

CAAAGACTTC AGGATGTCAT CTATGACCGA GACTCACTCC AGAGACAGCT CAATTCAGCC 240 

CTGCCACAGG ATATCGAATC CCTAACAGGA GGGCTGGCTG GTTCTAAGGG GGCTGA1CCA 300 

CCGGAATTTG CTGCACTGAC AAAAGAATTA AATGCCTGCA GGGAACAACT TCTAGAAAAG 360 

GAAGAAGAAA TCTCTGAACT TAAAGCTGAA AGAAACAACA CAAGACTATT ACTGGAGCAT 420 

TTGGAGTGCC TTGTGTCACG ACATGAAAGA TCACTAAGAA TGACGGTGGT AAAACGGCAA 480 

GCCCAGTCTC CCTCAGGAOT ATCCAGTOAA GTTGAAGTTC TCAAGGCACT GAAATCTTTG 540 

TTTGAGCAGC ACAAGGCCTT GGATGAAAAG GTAAGGGAGC GACTGAGGGT TTCTTTAGAA 600 

AGAGTCTCTG CACTGGAAGA AGAACTAGCT GCTGCTAATC AGGAGATTGT TGCCTTGCQT 660 

GAACAAAATG TTCATATACA AAGAAAAATG GCATCAAGCG AGGGATCCAC AGAGTCAGAA 720 

CATCTTGAAG GGATGGAACC TGGACAGAAA GTCCATGAGA AGCGTTTGTC CAATGGTTCT 780 

ATAGACTCAA CCGATGAAAC TAGTCAAATA GTTGAACTAC AAGAATTGCT TGAAAAGCAA 840 

AACTATQAAA TGGCCCAGAT GAAAGAACGT TTAGCAGCCC TTTCTTCCCG AGTGGGAGAG 900 

GTGGAACAGG AAGCAGAGAC AGCAAGAAAG GATCTCATTA AAACAGAAGA AATGAACACC 960 

AAGTATCAAA GGGACATTAG GGAGGCCATG GCACAAAAGG AAGATATGGA AGAAAGAATT 1020 

ACAACCCTTG AAAAGCGTTA CCTCAGTGCT CAGAGAGAAT CTACCTCCAT ACATGACATG 1080 
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AATGATAAAC TAGARAATGA GTTAGCAAAT AAAGAAGCTA TCCTACGGCA GATGGAAQAG 1140 

AAAAACAGAC AGTTACAAGA ACGTCTTGAG CTAGCTGAAC AAAAGTTGCA GCAGACCATG 1200 

AGAAAGGCTG AAAGCTTGCC TGAAGTAGAG GCTGAACTGG CTCAGAGAAT TGCAGCCCTA 1260 

ACCAAGGCTG AAGAGAGACA TGGAAATATT GAAQAAGGTA TGAGACATTT AGAGGGTCAA 1320 

CTTGAAGAGA AGAATCAAGA ACTTCAAAGA GCTAGGCAAA GAGAGAAAAT GAATGAGGAG 1380 

CATAACAAGA GATTATCGGA TACGGTTGAT AGACTTCTGA CTGAATCCAA TGAACGCCTA 1440 

CAACTACACT TAAAGGAAAG AAT6GCTGCT CTAGAAGAAA AGAATGTTTT AATTCAASAA 1500 

TCAGAAACTT 5CA6AAAGAA TCTTGAAGAA TCTTTACATG ATAAGGAAAG ATTAGCAGAA 1560 

OAAATTGAAA AGCTGAGAXC TGAACTTGAC CAATT6AAAA TGAGAACTGG CTCTTTAATT 1620 

GAACCCACAA TACCAAGAAC TCATCTAOAC ACCTCAGCTG AGTTGCGGTA CTCAGTGGGA 1680 

TCCCTAGTGO ACAGCCA6TC TCATTACAGA ACAACTAAAC TAATAAGAAG ACCAA6GAGA 1740 

GGCCGCATGG GTGTGCGAAG AGATGAGCCA AAGGTGAAAT CTCTTGGGGA TCACGAGTGG 1800 

AATAGAACTC AACAGATTGG AGTACTAAGC AGOCACCCTT TTGAAAGTGA CACTGAAATG 1860 

TCTGATATTG ATGATGAT6A CAGAGAAACA ATTTTTAQCT CAATGGATCT TCTCTCTCCA 1920 

ASTGGTCATT CCGATGCCCA GACGCTAGOC ATGATGCTTC AGGAACAATT GGATGCCATC 1980 

AACAAAGAAA TCAGGCTAAX TCAGGAAGAA AAAGAATCTA CAGAGTIGCG TGCTGAAGAA 2040 

ATTGAAAATA GAGIGGCTAG TGTGAGCCTC GAAGGCCTGA ATTTGGCAAG GGTCCACCCA 2100 

QGTACCTOCA TTACTGOCTC TGTTACAGCT TCATCGCTGG CCAGTTCATC TCCOCOCAGT 2160 

GGACACTCAA CTOCAAAGCT CACCCCTCQA AGCCCTGCCA GGGAAATGGA TCGGATGGGA 2220 

GTCATGACAC TGCCAAGTGA TCTGAGGAAA CATCGGAGAA AGATTGCAST TGTGGAAGAA 2280 

GATGGTCGAG AGGACAAAGC AACAATTAAA TCTGAAACTT CTCCTCCTCC TACCCCTAGA 2340 

GCCCTCAGAA TGACTCACAC TCTCCCTTCT TCCTACCACA AXGAXGCTGG AAGTAGTTTA 2400 

TCTGTCTCTC TTGAGCCAGA AAGCCTCGGG CTTGGTAGTG CCAACAGCAG CCAAGACTCT 2460 

CTTCACAAAG CCCCCAAGAA GAAAGGAATC AAGTCTTCAA TAGGACGTTT GTTTGGTAAA 2S20 

AAAGAAAAAG CTCGACTTGG GCAGCTCCGA GOC T TTATGG AGACTGAAGC TGCAGCTCAG 2580 

GAGTOCCTGG GGTTAGGCAA ACTOGGAACT CAAGCTGAGA AGGATCGAAG ACTAAAGAAA 2640 

AAGCATGAAC TTCTTCAAGA AGCTCGGAGA AAGGGATTAC CTTTTGCCCA GTGGGATGGG 2700 

CCAACTGTGG TCGCATGGCT ASAGCTTTGG WGGGAAK3C CTGCGTGGTA CGTGGCAGCC 2760 

TGCCGAGCCA ACGTGAAGAG TGGTGCCATC ATGTCTGCTT TATCTGACAC TGAGATCCAG 2820 

AGAGAAATTG GAATCAGCAA TCCACTGCAT CGCTTAAAAC TTCGATTAGC AATCCAGGAG 2880 

ATGGTTTCCC TAACAAGTCC TTCAGCTCCT CCAACASCTC GAACTCCTTC AGGCAACGTT 2940 

TGGGTGACTC ASGAAGAAAT GGAAAATCTT GCAGCTCCAG CAAAAACGAA AGAATCTGAG 3000 

GAAGGAAGCT GGGCCCAGTG TCCGGTTTTT CTACAGACCC TGGCTTATGG AGATATGAAT 3060 

CATGAGTGGA TTGGAAATGA ATGGCTTCCC AGCTTGGGCT TACCTCAGTA CAGAAGTTAC 3120 

TTTATGGAAT GCTTGGTAGA TGCAAGAATC TTAGATCACC TAACAAAAAA ASATCTCCGT 3180 

GTCCATTTAA AAATGGTGGA TAGTTTCCAT CGAACAAGTT TACAATATGG AATTAIGTGC 3240 

TTAAAGAGGT TGAATTATGA CAGAAAAGAA CTAGAAASAA GACGGGAAGC AAGCCAACAT 3300 

GAAAXAAAAG ACGTGTTGGT GTGGAGCAAT GACCGAATTA TTCGCTGGAT ACAAGCAATT 3360 

GGACTTCGAG AATATGCAAA TAATATACTT GAGAGCGGTG TGCATGGCTC ACTTATAGCC 3420 

CTGGATGAAA ACTTTGACTA CAGCAGCTTA ACTTTATTAT TACAGATTCC AACACAGAAC 3480 

ACCCAGGCAA GGCAGATTCT TGAAAGAGAA TACAATAACC TCTTGGCCCT GGGAACTGAA 3540 

AGGCGACTGG ATGAAAGTGA TCACAAGAAC TTCAGACGTG GATCAACCTG GAGAAGGCAG 3600 

TTTCCTCCTC GTGAAGTACA TGGAATCAGC ATGATGCCTG GGTCCTCAGA AACATTACCA 3660 

GCTGGATTTA GGTTAACCAC AAOCTCTGGG CAATCAAGAA AAATGACAAC AGATGTTGCT 3720 

TCATCAAGAC TGCAGAGGTT AGACAACTCC ACTGTTCGCA CATACTCATG TCTCGAGTAA 3780 
GGGGGOGCTT TAA 



SEQ ID NO:238 PM28 Protein sequence: 
Protein Accession #: none found 



1 11 21 31 41 51 

I I I I I I 

HHCEVHPTIN EDTPUSQRGS QSSGSDSDSH FEQLMVNMLD ERDRLLDTLR ETQESLSLAQ 60 

QRLQDVIYDR DSLQHQLNSA LPQDIESLTG GLAGSRGADP PEFAALTKEL HACREQLLEK 120 

EEEISELKAE RNNTRLLLEH LECLVSKHER SLRMTWKRQ. AQSPSGVSSE VEVLKALKSL 180 

FEHHKALDEK VRERLRVSLE RVSALEEELA AANQEXVALR EQKVHIQRKH ASSEGSTESE 240 

HLEGMEPGQK VHBKRLSNGS IDSTDETSQI VELQELLEKQ HYEMAQHKER LAALSSRVGE 300 

VEQEAETARK DLIKTEEHNT KYQRDIREAH AQKEDMEERI TTLEKRYLSA QRESTSIHDK 360 

HDKLENELAN KEAILRQMEE KNRQLQKRLE LAEQKLQQTO RKAETLPEVE AELAQRIAAt 420 

TKAEERHGNI EERHBffI.EGQ LEEKNQELQR ARQREKHNEE HNKRLSDTVD RLLTESNERL 480 

QLBLKERHAA LEBKNVLIQE SETFEKKLEE SLHDKEBLAE EIEKLRSELD QLKHRTGSLI 540 

EPTIPRTHLD TSAELRYSVQ SLVDSQSDYR TTKVXRRPRR GRMGVRRDEP KVKSLGDHEW 600 

NRTQQIGVLS SHPFESDTEM SDIDDDDRET IFSStlDLLSP SGHSDAQTLA MMLQEQLDAI 660 

KKEIRLIQEE KESTELRAEE IENKVASVSL EGLNLAHVHP GTSITASVTA SSLASSSPPS 720 

GHSTPKLTPR SPAREMDRMG VMTtPSDLRK HRRKIAWEE DGREDKA7XK CETSPPPTPR 780 

ALBMTHTLPS SYHNDARSSL SVSLEPESLG LGSANSSQDS LHKAPKKKGI KSSIGRLFGK 840 

KEKARLGQLR GFMETEAAAQ ESLGLGKLGT QAEKDRBLKK KHELLEEARR KGLPFAQHDG 900 

PTWAWLELW LGHPAWYVAA CRANVKSGAI HSALSDTE1Q REIQISNPLH RLKLRLA1QB 960 

HVSLTSPSAP PTSRTPSGNV WVTHEEMENL AAPAKXKESE EGSHAQCPVF LQTLAYGDHtJ 1020 

HEWIGHEWLP SLGtPQYRSY PMECLVDARM LDHLTKKDLR VHLKMVDSFH RTSLQYGIMC 1080 

LERUreDRKE LERRREASQH EIKDVLVWSN DRI1RWIOAI GLREYAKNIL ESGVHGSLIA 1140 

LDENFDYSSL TLLLQIPTQM TQARQILERE YHNLLALGTE RRLDESDDKH PRRGSTWRRQ 1200 
PPPREVHGIS KKPGSSETLP AGPRLITTSG QSRKHTTDVA SSRLCRLDNS TVRTYSCLE 

SEQ ID NO:239 PCW DNA SEQUENCE 

Nucleic Acid Accession #: NM_016570 

Coding sequence: 1-1134 (underlined sequences correspond to start and stop codons) 
1 11 21 31 41 51 

I I I I I I 

ATGAGGCGAC TGAATCGGAA AAAAACTTTA- AGTTTGGTAA AAGAGTTGGA TGCCTTTCCC 60 

AACO TT CC T Q AGAGCTRTGT AGAGACTTCA GCCAGTGGAjG GTACAGTTTC TCTAATAGCA 120 

TTTACAACTA TGGCTTTATT AACCATAATG GAATTCTCAG TAIATCAAGA TACATGGAIG ISO 

AAGTATGAAT ACGAAGTAGA CAAGGATTTT TCTAGCAAAT TAAGAATTAA TATAGATATT 240 

ACTGTTGCCA TGAAGTGTCA ATATGTTGQA GCGGATGTAT TGGATTTAOC AOAAACAATG 300 

GTTGCATCTG CAGATGGTTT AGTTTATGAA CCAACAGIAT TTGATCTTTC ACCACAGCAG 360 

AAACAGTGGC AGAG6ATGCT GCAGCTGATT CAOAGTAGGC TACAAGAAGA GCATTCACTT 420 

CAAGATGTGA TATTTAAAAG TOCTTTTAAA AOTACATCAA CAGCTCTTCC ACCAAGAGAA 480 

GATGATTCAT CACAGTCTCC AAATGCATOC AGAATTCATG GCCATCTATA TGTCAATAAA 540 

GTAGCA0G6A ATTTTCACAT AACAGTCGGC AA6GCAATTC CACATCCTOG TGGTCATGCA 600 

CATTTGGCAG CACTTGTCAA CCATGAATCT TACAATTTTT CTCATAGAAT AGATCATTTO 660 

TCTTTTGGAG AGCTTGTTCC A6CAATTATT AATCCTTTAG ATOGAACT6A AAAAATTGCT 720 

ATAGATCACA ACCAGATGTT CCAAXATTTT ATTACAGTTQ TGCCAACAAA ACTACAXACA 780 

TATAAAATAT CA6CA6ACAC CCATCAGTTT TCTGTGACAG AAAGGGAAOG TATCATTAAC 840 

CATGCTGCAG GCAGCCATGG AGTCTCTGGQ ATATTTATGA AATATGATCT CAGTTCTCTT 900 

AT6GTGACAS TTACTGAGQA GCACATGCCA TTCTGGCAGT TTTTTGTAA6 ACTCTGTGGT 360 

ATTGTTGGAG GAATCTTTTC AACAACAGGC ATGTTACATG GAATTGSAAA ATTEATAGTT 1020 

GAAAXAATTT GCTGTCGTTT CAQACTTCGA TCCTATAAAC CTGTCAATTC TGTTCCTTTT 1080 
GA6GAT6GCC ACACASACAA CCACTTACCT CTTTTAGAAA ATAATACACA TtGA 



SEQ ID NO:240 PCM Protein sequence: 
Protein Accession #: NP_057654 

1 11 21 31 41 51 

I i I I I I 

HRRLNRKKTL SLVKBLDAFP KVPESYVETS ASGGTVSLIA FTTMALLTIM EFSVYQDTWH 60 
KYEYEVDKDF SSKLRIHIDI TVAMKCQYVG ADVLDLAETM VASADGLVYE PTVFDLSPQQ 120 
KEWQEMLQLI QSRLQBBHSL QDVIFKSAFK STSTALPPRE DDSSQSPNAC RIHGHLYVNK 180 
VAGNFHTTVG KAIPHPRGHA HLAALVNHES YNFSHRIDHL SFGELVPAII HPLDSTEKIA 240 
IDHNQMFQYP ITWPTKLHT YKISADTHQF SVTERERIIN HAAGSHGVSG IFMKYDLSSL 300 
KVTVTEEHMP PWQPPVRLCG IVGGIFSTTG MLEGIGKFIV EIICCRFRLG SYKPVNSVPF 360 
EDGHTDNHLP LLEKNTH 
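
Each DNA record in this listing is paired with a protein record, and the two are related through the coding sequence. As a hedged illustration of that relationship, the sketch below translates a coding sequence with the standard genetic code. The function name `translate` and the choice to emit `X` for any codon containing a non-ACGT character are assumptions made for this example; the input is the first printed row of the SEQ ID NO:239 DNA record with spaces and the position count removed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
# (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon.

    Codons containing characters other than A, C, G, T are reported
    as "X" (an assumption made for this sketch).
    """
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3].upper(), "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First row of the SEQ ID NO:239 record.
print(translate("ATGAGGCGACTGAATCGGAAAAAAACTTTA"))  # MRRLNRKKTL
```

Comparing a translation like this against the printed protein record is one way to spot residues that were misread during text extraction.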



SEQ ID NO:241 PBA7 DNA SEQUENCE 

Nucleic Acid Accession #: AA219134 

Coding sequence: 24-1815 (underlined sequences correspond to start and stop codons) 



AATTCGCCCT TGCTTAATTA AGCAJ2TTTA CXmCCTGTC ATCTGTCACT GCTGCTOTCA 60 
GTGGCCTCCT GGTGGGTTAT GAACTTGGGA TCATCTCTGG GGCTCTTCI 1 CAGATCAAAA 120 
CCTTATTAGC CCTGAGCTGC CATGAGCAGG AAATGGTTGT GAGCTCCCTC GTCATTGGAG 180 
CCCTCCTTGC CTCACTCACC GGAGGGGTCC TGATAGACAG ATATGGAAGA AGG ACAGCAA 240 
TCATCTTGTC ATCCTGCCTG CTTGGACICG GAAGCTTAGT CTTGATOCTC AGTTTATCCT 300 
ACACGGTTCT TATAGTGGGA CGCATTGCCA TAGGGGTTTC CATCTCCCTC TCTTCCATTG 360 
OCA CI 1U I U 1 TTACATCGCA GAGATTGCTC CTCAACACAG AAGAGGCCTT CTTGTGTCAC 420 
TGAATGAGCT GATGATTGTC ATCGGCATTC TTTCTGCCTA TATTTCAAAT TACGCATTTG 480 
CCAATGTTn CCATGGCTGG AAGTACATGT TTGGTCTTGT GATTOCCITG GGAGTTTTGC 540 
AAGCAATTGC AATOTATTTT CTTCCTCCAA GCCCTCGGTT TCTGGTG ATG AAAGGACAAG 600 
AGGG AGCTGC TAGCAAGGTT CTTGGAAGGT TAAGAGCACT CTCAGATACA ACTG AGGAAC 660 
TCACTGTGAT CAAATCCTCC CTOAAAGATG AATATCAGTA CAGTTTTTGG GATCTGTTTC 720 
GTTCAAAAG A CAACATGCGG ACCCG AATAA TGATAGGACT AACACTAGTA TTTITTGTAC 780 
AAATCACTGG CCAACCAAAC ATATTOTTCT ATGCATCAAC TGTTTTGAAG TCAGTTGGAT 840 
TTCAAAGCAA TGAGGCAGCT AGCCTCGCCT CCACTGGGGT TGG AGTCGTC AAGGTCATTA 900 
GCACCATCCC TGCCACTCTT CTTGTAGACC ATGTGGGCAG CAAAACATTC CTCTGCATTG 960 
GCTCCTCTGT GATGGCAGCT TCGTTGGTGA CCATGGGCAT CGTAAATCTC AACATCCACA 1020 
TGAACTTCAC CCATATCTGC AGAAGCCACA ATTCTATCAA CCAGTCCTTG GATGAGTCTG 1080 
TGATTTATGG ACCAGGAAAC CTGTCAACCA ACAACAATAC TCTCAGAGAC CACTTCAAAG 1140 
GGATTTCTTC OCATAGCAGA AGCTCACTCA TGOCCCTGAQ AAATGATGTG GATAAGAGAG 1200 
GGGAGACGAC CTCAGCATCC TTGCTAAATG CTGGATTAAG CCACACTGAA TACCAGATAG 1260 
TCACAGACCC TGGGG ACGTC CCAGCTTTTT TGAAATGGCT GTCCTTAGCC AGCTTGCTTG 1320 
TTTATGTTGC TGCTTTTTCA ATTGGTCTAG GACCAATGCC CTGGCTGGTG CTCAGCGAGA 1380 
1C 1 1 10C1GG TGGGATCAGA GGACGAGCCA TGGCTTTAAC TICTAGCATO AACTGOGGCA 1440 
TCAATCTCCT CATCTCGCTG ACATTTTTG A CTGTAACTGA TCTTATTGGC CTGCCATGGG 1500 
TGTGC TTTAT ATATACAATC ATGAGTCTAG ATCTTATTGG CCTGCCATGG GTGTGCTTTA 1560 
TATATACAAT CATOAGTCTA GCATCCCTGC 1111 rGTIOT TATGTTTATA CCTGAGACAA 1620 
AGGGATGCTC TTTGGAACAA ATATCAATGG AGCTAGCAAA AGTGAACTAT GTGAAAAACA 1680 
ACATTTOTTT TATGAGTCAT CACCAAG AAG AATTAGTGCC AAAACAGCCT CAAAAAAGAA 1740 
AACCCCAGGA GCAGCTCTTG GAGTOTAACA AGCTGTGTGG TAGGGGCCAA TCCAGGCAGC 1800 
TTTCTCCAGA GACCTAATGG CCTCAACACC TTCTGAACGT GGATAGTGCC AGAACACTTA I860 
GGAGGGTGTC TTTGGACCAA TGCATAGTTG (X5ACTCCTGT GCTCTCTTTT CAGTGTCATG 1920 
GAACTGGTTT TGAAGAGACA CTCTGAAATG ATAAAGACAG CCTTTAATCC CCCTCCTCMC 1980 
CAG AAGGAAC CTCAAAAGGT AGATG AGGTA CAAGGTCCTA AOTGATCTCT TTTTCTGAGC 2040 
AGGATATCAG GTTAAAAAAA AAAAGTTACT GGCTGGTTTA ATACTTTCTA CCTTCTTCAC 2100 
AGAGCAGCCT TTGAATAG AC TATGTCCTAG TGAAGACATC AACCTCCGCC TTAAGCTATG 2160 
TATGTATGGA GGCCAGTCGC AGCTTTATTA TGCAGACACA CAAGTGGTCT GGACATGAGG 2220 
GTACAGTTTC TGCCTACCAA OACACTACTT GCACTGGATC TTACGCAAAA AAGAACCAGA 2280 
ACACACAGTO TOGACAACTG CCCATATATT CTATCTAGAT TAGGAOAGGG TCCT GGCTA O 2340 
GATnTAGTG GTAATTCCTA GTTACATTCA ACAAGTATAA AGATTATAGA GCTTATTTTA 2400 
TGAACTATAA ACTATAATTT AATGCAAAAT ATOCITTTAT GAATTTCATG TTAATATTGT 2460 
GAAATATTAA AATAATTCCR CAATAGTTG A OAAAAATGAG CATTTTTTTC CATTTTTAAA 2520 
AAATGCATAQ AAAAG ACAAT TTTAAAATCC TGGQACCATA TTTATTTAGA AGTAGCTGTT 2580 
AGTAAAACAT TAGAAAAGGA GTCAGGCCAT TAGGTTATTT ATCCAAATCT CTAAGCAATT 2640 
AGGTTO AAGT TATTAAGTCA AGOCTAGAAA AGCTGCCTCC TTOTAAGGCT TTCATGACAA 2700 
TGTATAGTAA TOCACAGTGT CCAATTCTTC ACACTCCTCA GGAATATCAC TACCTCAGGT 2760 
TAOGGTACAC AGGCTATAAT TGATQATGAT GTTCAOATAA CTGAAGACAC AATAAATOAC 2820 
ATICAGACAT CAGGAMAA WW CCCTCATCTT CTTTTCTATG ATGGCCACCT GTAC CAGCAA 2880 
CGTGGOTTTC ACCCACACAA CGATQAACTG TTCTCTTACT TCTCCAGTTG ATTTTAAAGA 2940 
CTTGTTAAGA GGTCTTACTA ATAAAATTTG GGTATGATAG AAAAWCCACA ATCAAAWCTT 3000 
GAACCAAATA ACATATTAAA TTACTAATAT TTAAGTOATG GAAGACACAC AAAAAACTTA 3060 
AAAGCACGAA CAAOCTAACT TG AAAAAGAA TJTTAAAATA TGATTAACCT GAAOAAAAG A 3120 
GAATOCTAAG AGCCAAAGCT C C1 HI 1 ATT TAGCTTGGAA TTTTCCTATT GGTTCCTAAC 3180 
AAACTGTCCC AATGTCATAT AAGOAAACAT OATCTATTAC ATTCCTTTAT AACAATGTGG 3240 
AGAGACTATA AAOCTATGTA AGTAGTAAAA CTATATYAGA GACTCAGGAG ACIGACTAAA 3300 
AGGCCTGGAT CTGCAGTGTA TTATCTGTAT AAAAATTGGC AGGGGGAAGC TAAAAGGAAA 3360 
GGAGATTGGA GATCTCAATT CTATCATGGT GTATTTCATA CGCAAATCAG AGCATGCATT 3420 
GTnTTTGTT TTTGGAAAQA GAAGGGAAGT GTGITCTGCC CCATOTTTCC TTCCGTGTTT 3480 
ATAGTTCAAA CTCTATATAT ACTTCAGGTA TnTITCTTI AGCCCTTCAT TATAA ATGGO 3540 
CAGGAAATTG TTTATCAACC TAGCCAGTTT ATTACTAGTG ACCTTGACTT CAGTATCITQ 3600 
AGCATTCTIT TAT A 1 1 1 1 1C TTTTATTATC CTGAGTCTGT AACTAAACAA TTTTGTCTTC 3660 
AAATTTTTAT CCAATATCCA TTGCACCACA CCAAATCAAG CTTCTTGATT TTCAAAAATA 3720 
AAAAGGGGG A AATACTTACA ACTTGTACAT ATATATTCAC AGTTTTTATT TATAAAAAAA 3780 
ATTTACAGTA CTTATGGAGA GOCAGCAGAA GACATCAGAG CACTCACTTC TTOCCATCTT 3840 
TGTTAAGGTT AGOG AATTAC OCATGGACAC TGTTAGGTG A GGCTCATTCG GCAGCCCTGA 3900 
AAACAAACCT GGTCACACTG TCTTTACCCT CTCCCTTCAG ATAAAGCACT TCGATTATCT 3960 
ATTGATCTGC CCAGTTTTCA AGTCATGCGA ATACTAAAAA GGTTACATCA TCTOGATCTG 4020 
TAGCTTGGCT ATATAAGCAT GTTTTCC CCC TATTCTATGT TTCTTTTTTT GGTGAACATT 408 0 
GAAAAACAGG AGGTQACTTA TTACTGTTAA TTAAAACTAA ATGAAAAATG TCAAGTCTTT 4140 
AAAACAGTGA GCTTGTAACT CTTTCATGTA ATTTTATTCT CTATGAATTT GGCTATCCTA 4200 
CTGAATCTTA AAATAAAGGA AATAAACACT TTTTTTTWAA AAAAAAGGAA AAATAMAARW 4260 
MWAAAAATCT CAATGAAATA TTTCACAAGA AGGAAAAA 



SEQ ID NO:242 PBA7 Protein sequence: 

Protein Accession #: AAF91431 

MFTFLSSVTA AVSGULVGYE LGHSGALLQ KTLLALSCH BQEMWSSLV IGALLASLTG 60 
GVUDRYGRR TAIILSSCLL GLGSLVULS LSYTVLIVGR 1AIGVSISLS SIATCVYIAE 120 
IAPQHRRGU, VSLNELMIVI GILS AYISNY AFANVFHGWK YMFGLVIPLG VLQAIAMYFL 180 
PPSPRFLVMK GQEGAASKVL GRLRALSDTT EELTVKSSL KDEYQYSFWD LFRSKDNMRT 240 
RMGLTLVF FVQITGQPNI LFYASTVLKS VGFQSNEAAS LASTGVGWK VISTIPATLL 300 
VDHVGSKTFL CIGSSVMAAS LVTMGIVNLN IHMNFIHICR SHNSINQSLD ESVIYGPGNL 360 
STNNNTLRDH FKGESHSRS SLMFLRND VD KRGETTSASL LNAGLSHTEY QIVTDPGD VP 420 
AFLKWLSLAS LLVYVAAFSI GLGFMFWLVL SED7GGIRG RAMALTSSMN WGINLLBLT 480 
FLTVTDLIGL PWVCHYTM SLDLIGLPWV CFIYTIMSLA SLLFWMHP ETKGCSLBQI 540 
SMELAKVNYV KNMCEMSHH QEELVPKQPQ KRKPQEQLLE CNKLCGRGQS RQLSPET 

SEQ ID NO:243 PAB4 DNA sequence: 
Nucleic Acid Accession #: AA172056 

Coding sequence: 121-339 (underlined sequences correspond to start and stop codons) 

TTTAGCCACC AGAGGANTTC TCTTGAAATA CCCAAAATCC ATCAGTATCT TGAATCATGC 60 
TGGATTTTGA AGAATTCTTA AGAAGCCATG TAAAGGGGGC TCTCTGGOCT TGAAATAGTG 120 
ATGTTTTTTA TACAGAAAGG AGAATGCAGA ATGGTCAGAC TATCATGCAC TGTTAAATTT 180 
GATTTCAAGA AATTACAGGA AAACTTTCCA AAGT TCCAT C TCACAGAANN TTATTTTNCC 240 
AAGAATTCCA AGATAAGTTT AGTTTTATGG AAGACTTTTA TGTGGTTTTT ACTCACTCTT 300 
CATCTCAGAC ATCGACAGAT GATTACATCA CTTATAGTTC TAGTAAATTT ATTAATATAA 360 
AACTCAG AGA CATTCCAATA TCCACATTGC TTACACCATT AGGCATAG AT TCAGTGTCAG 420 
CTATQACAAT TGAAAATG AG CTGTTTTGTG ATTTAAAGGT TTAAATTTCT CTAACCAAAC 480 
TGCTTOATCC AGATGCAGGA CTGCAAATGT TAATATTTGT T CTGGA AGAA CAATCAAATA 540 
AGACTTAAGA GQAAAGGGAA TGGCCACAAT CCACCTGAAA TTTTTTCTTA AAAAGTQTGC 600 
AGCCTACTAA ATCAGAATGA AAATAGAAGT ACAAGATTAT AAACAAAATG CAATCAAACT 660 
TTTCTTAAGC TTACCTAAAG TTATTTCATC TG AAAATTTC AAGCAACTTT GTTCAACATT 720 
AAATTGACAA TCTAAACTAA CAAGTCTTTT GAATTTATGC ATGGTAGTAA ACATTCTCTC 780 
TATTAACTTT ATTACCTAAG GCTAAACCTA AAATTTTTAA GCAAAATTAO AAAAATAGTC 840 
TTCACTCATC AAAAAATAAA GTTTGTTACA TTTAGTATTT TCCCAATAAA ATTGGTCGTT 900 
CTTGGTTTTT TATTTGQAGA GTCTGTGCAA AATGTCACTA AAAAT AAATT AGCACT AG AA 960 
ATTATTTCTA AATACCAAA 

SEQ ID NO:244 PBQ8 DNA SEQUENCE 

Nucleic Acid Accession #: X51405 

Coding sequence: 3-1721 (underlined sequence corresponds to start and stop codon) 
1 11 21 31 41 51 
AAATGGCGTG CCCGTCTCTC CGCCGGCGGC CTGCCTCGCA 0T66TTTCTC CTGCAOCTCC 60 

CCTGGGCTCC GOGGCCAGTA GTGCAGCCCC TGGAGCOGCG GCTTTGCOCG TCTCCTCTGO 120 

GTGGCCCCAG TGGGCGGGCT GACACTCATT CAGGCGGGGA AGGTGAGGCG AG7AGAGGCT 180 

ggtgoggaac ttgccgcccc cagcao0g0c ggcggcctaa gcccagggcc gggcagacaa 240 

aagaqgccgc ccgcgtagga aggcacggcc ggcggcggcg gagcgcagcg atggcccggc 300 

GAGGGGGCAG CGCGCTGCTG GCTCTOTGOO GGQCACTGGC TGCCTGCGGG TGGCTCCTGG 360 

GCGCCGAAGC CCAOGAGCCC GCGGCGCCCG CGGCGGGCAT GAGGCGGCQC CGGCGGCTGC 420 

AGCAAGAGGA CGGCATCTCC TTCGAGTACC flOCOCTSOOC CGAGCTGCGC GAGGCGCTCQ 480 

TGTCCGTGTG GCTGCAOTGC ACCGCCATCA GCAGGATTTA CACGGTGCGG CGCAGCTTCG 540 

AflGGCCGGOA GCTCCTGGTC ATCGAGCTQT CCGACAACCC TOGCGTOCAT GAGCCTGGTG 600 

AGCCTGAATT TAAATACATT GGGAAXATQC ATGG6AAT6A GGCTGTTGGA OGAGAACTGC 660 

TCATTTTCTT GGCCCAQTAC CTATGCAACG AATACCAGAA GGGGAACGAG AC AATTG TCA 720 

ACCTGATCCA CAGTACCCGC ATTCACATCA TGCCTTCCCT GAACCCAGAT GGCTTTCAGA 780 

ASGCAGCGTC TCAGCCTGGT GAACTCAAGG ACTGGTTTGT GGGTCGAAGC AATGCCCAGO 840 

GAA3EAGATCT GAACGGGAAC TTTCCAGACC TGGATAGGAT AGTGTACGTG AATGAGAAAG S00 

AAGOTGGTCC AAATAATCAT CTGTTOAAAA AIATGAAGAA AATTGTGQAT CAAAACACAA 960 

AGCTTGCTCC TGAGACCAAG GCTGTCATTC ATTGCATTAT GGATATTCCT TTTGTGCTTT 1020 

CTGCCAATCT CCAXGGAGGA GAOCTTGTGG CCAATTATOC ATATGATGAG ACGCGGAGTO 1080 

GTW3TGCTCA CGAATACAGC TCCTCCOCAS MGAOGCCAT TTTCCAAAGC TTGGOGGGGG 1140 

CATACTCTTC TTTCAACCCG GCCATGTCTG ACCCCAATCG GCCACCATGT CGCAAGAATG 1200 

ATGATGACBG CAGCTTTGTA GATQGAAOCA CCAACGGTGG TGCTTGGTAC A60GTACCTG 1260 

QAGGGATGCA AJ3ACTTCAAT TACCTTAGCA GCAACTGTTT TQAGATCACC GTGGAGCTTA 1320 

GCTGTGAGAA GTTCCCACCT GAAGAGACTC TGAAGACCTA CTGG GAGGAT AACAA AAAC T 1380 

CCCTCATTAG CTACCTTGAG CAGATACACC GAOSAGTTAA AGGATTTGTC CGAGAOCTTC 1440 

AAGGTAACCC AATTGCQAAT GCCACCATCT CCGTGOAAGG AAXAGAOCAC GATGTTACAT 1500 

CCGCAAAGGA TGGTGATTAC TGGAGATTGC TTATACCTGG AAACTATAAA CTTACAGCCT 1560 

CAGCTCCAGG CTATCTGGCA ATAACAAAGA AAGTGGCAGT TCCTTACAGC CCTGCTGCTG 1620 

GGGTTGATTT TGAACTGGAO TCATTTTCTG AAAGGAAASA AGAGGAGAAG GAAGAATTGA 1680 

TGGAATGGTG GAAAATGATG TCAGAAACTT TAAATTTTTA AA AAGGCTTC TAGTTAGCTG 1740 

CTTTAAATCT ATCTATATAA TGTAGTATGA TGTAATGTGG TCTTTTTTTT AGATTTTGTG 1800 

CAQTTAMAC TTAACATTGA TTTATTTTTT AATCATTTAA ATATTAATCA ACTTTCCTTA 1860 

AAAEAAATAG CCTCTTAGGT AAAAATATAA GAACTTGATA TATTTCATTC TCTTATATAG 1920 

TATTCATTTT CCTACCTATA TTACACAAAA AAGTATAGAA AAGATTIAAG TAATTTTGCC 1980 

ATCCTAGGCT TAAATGCAAT ATICCTGGTA TTATTTACAA TGCAGAATTT TTTGAGTAAT 2040 

TCTAGCTTTC AAAAATTAGT GAAGTTCTTT TACTGTAATT GGTGACAATG TCACATAATO 2100 

AATGCTATTG AAAAGGTTAA CAGATACACC TCGGAGTTGT GAGCACTCTA CTGCAAGACT 2160 

TAAATAGTTC AGTATAAATT tf lW i' l TTl'l' TCTTGTGCTS ACTAACTATA AGCATGATCT 2220 

TGTTAATGCA TTTTTGATGG GAAGAAAAGG TACATGTTTA CAAAGAGGTT TTATGAAAAG 2280 

AATAAAAATT GACTTCTTGC TTGTACATAT AGGAGCAATA CTATTATATT ATGTAGTCCG 2340 

TTAACACTAC TTAAAAGTTT AGGGTTTTCT CTTGGTTCTA GAGTGGCCCA GAATTGCATT 2400 
CTGAATGAAT AAAGGTTAAA AAAAAATCCC CAGTGAAAAA AAA 

SEQ ID NO:245 PBQ8 Protein sequence: 
Protein Accession #: P16870 

MAGRGGSALL ALCG ALAACG WLLCAEAQEP G APAAGMRRR RRLQQEDGIS FEYHRYPELR 60 
EALVSVWLQC TABRIYTVG RSFEGRELLV IELSDNPGVH EPGEPEFKYI GNMHGNEAVG 120 

RELUFLAQY LCNEYQKGNE TTVNUHSTR IHIMFSLNPD GFEKAASQPG ELKDWFVGRS 180 
NAQOIDLNRN FPDLDRIVYV NEKEGGPNMH LLKNMKKIVD QNTKLAPETK AVIHWIMDIP 240 
FVLSANLHGG DLVANYPYDE TRSGSAHEYS SSPDDAIFQS LAEAYSSFNP AMSDPNRPPC 300 
RKNDDDSSFV DGTTNGOAWY SVPGGMQDFN YLSSNCFOT VELSCEKFPP EETLKTYWED 360 
NKNSLBYLE QHRGVKGFV RDLQGNPIAN ATBVEGIDH DVTSAKDGDY WRLLIPGNYK 420 
LTASAPGYLA ITKKVAVPYS PAAGVDFELE SFSERKEEEK EELMEWWKMM SETLNF 



SEQ ID NO:246 PBY 4 DNA sequence 

Nucleic Acid Accession #: AF038966 

Coding sequence: 81-1107 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

III III 

GGGGCGACOT GAGCGOGCAO GGGGGCGGCG GCCTCGCCTC GTCTCTCTCT CTG CCCCT GG 60 

GTCGGGTGGG TGACGCCGAG AGCCAGAGAG ATGTCGGATT TCGACAGTAA CCCGTTH3CC 120 

GACCCGGATC TCAACAATOC CTTCAAGGAT CCATCAGTTA CACAAGTGAC AAGAAATGTT 180 

CCACCAGQAC TTOATGAATA TAATCCATTC TCGGATTCTA GAACACCTCC ACCAGGCGGT 240 

GTQAAGATGC CTAATGTAOC CAATACACAA CCAGCAATAA TGAAACCAAC AGAGGAACAT 300 

CCAGCTTATA CACAGATTGC AAAGGAACAT GCATTGGCCC AAGCTGAACT TCTTAAGCGC 360 

CAAGAAGAAC TAGAAAGAAA AGCCGCAGAA TTAGATCGTC GGQAACGAGA AATGCAAAAC 420 

CTCAGTCAAC ATGGTAGAAA AAATATTTGG CCACCTCTTC CTAGCAATTT TCCTGTCGGA 480 

CCTTGTTTCT ATCAGGAATT TTCTGTAGAC ATTCCTGTAG AATTCCAAAA GACAGTAAAG 540 

CTTATOTACT ACTTOTCGAT GTTCCATGCA GTAACACTGT TTCTAAATAT CTTCGGATGC 600 

TTGGCTTGGT TTTGTGTTGA TTCTGCAAOA GCGGTTGATT TTGGATTGAG TATCCTGTGG 660 

TTCTTGCTTT TTACTCCTTQ TTCATTTGTC TGTTGGTACA GACCACTTTA TGGAGCTTTC 720 

AGGAGTCACA GTTCATTTAG ATTCTTTGTA TTCTTCTTCG TCTATATTTG TCAGTTTGCT 780 

GTACATGTAC TCCAAGCTGC AGGATTTCAT AACTGGGGCA ATTGTGGTTG GATTTCATCC 840 

CTTACTCGTC 1CAACCAAAA TATTCCTGTT GGAATCATGA TGATAATCAT AGCAGCACTT 900 

TTCACAGCAT CAGCAGTCAT CTCACTAGTT ATGTTCAAAA AAGTACATGG ACTATATCGC 960 

ACAACAGGTG CTAGTTTTGA GAAGGCCCAA CAGGAGTTTG CAACAGGTGT GATGTCCAAC 1020 

AAAACTGICC AGACCGCAGC TGCAAATGCA GCTTCAACTG CAGCATCTAG TGCAGCTCAG 1080 
AATGCTTTCA AGGGTAACCA GATCTAAGAA TCTTCAAACA ATACACTGTT ACCTTTTGAC 1140 

TGTACCTTTT TCTCCAOTTA CTGTATTCTA CAAATATTTT TATQTTCAAA ftCACACAGTA 1200 

CMACAGCAT GGATATTTCC TGTTCACTTG TGCATGGGCT AAAACCAGGA AAACTTCCTT 1260 

GTCTTATTAC T1TACCTAAT AGTTTCTTAA TATTTCAGTG OCCCTTQCAQ AAAAAATATT 1320 

ACATGCTAAA TAAATATTCT CCATATTTTT QGGGGATGRC ATTCAGTGAA TTATTTCACT 1380 

GGTGACCCAC TOAAAATTAA TAATOQIACT TATGATTAAA AACGCATTTA ATACTAACTG 1440 

CAGTAGTTCT TTCAAGAATC TTTAGAGATA AGGATTGCSC ATTOGAAAAG TAAACCATGT 1S0O 

TTCATTCCTT TTTCCCTATT TATATTGAAA GAAATAGGCC AGCAGAGACT TAGGGATTTT 1560 

AAATTGGCTT GCTTTTTAGC TGTTTCAGTC ACCA6TGAAG AGCCTATGTO CATTT TGTAG 1620 

TAGATAATGT ARAMTTGTC ATCTTTTTCT TTTCTTTTTT TTAGAATAGC TGATATTTTQ 1680 

ATAACAATCT CTAATTTGCA TGGGCACCAC ATTTCTTATA TTAAAAGAAT TAGTGTTTTG 1740 

GCTTCTGTAC TGCTTATGGT TOTAGGATTC AQGGGTOAAT GGAATCACAG AAATGATATT 1800 

CTGCAROAAT TTCTTTTAAA TAAAAAOTTT GGGGQTGCAA TATAAGAAGT TTATATAATA 1860 

TGCAGTACAT TATCCAAAAG AGAAGGTAGT TAATGCAGTA GAAAQTAGTG GTAATAATTC 1920 
C1TAT1 1 



SEQ ID NO:247 PBV4 Protein sequence: 

Protein Accession #: 

MSDFDSNPFA DPDLNNPFKD PSVTQVTRNV FPGLDEYNPF SDSRTPPPGG VKMPNVPNTQ 60 
PAIMKPTEEH PAYTQIAKEH ALAQAELLKR QEELERKAAE LDRREREMQN LSQHGRKNIW 120 
PPLPSNFPVG PCFYQEFSVD IPVEFQKTVK LMYYLWMFHA VTLFLNIFGC LAWFCVDSAR 180 
AVDFGLSILW FLLFTPCSFV CWYRPLYGAF RSDSSFRFFV FFFVYICQFA VHVLQAAGFH 240 
NWGNCGWISS LTGLNQNIPV GIMMMAAL FTASAVISLV MFKKVHGLYR TTOASFEKAQ 300 
QEFATGVMSN KTVQTAAANA ASTAASSAAQ NAFKGNQI 



SEQ ID NO:248 PBH2 DNA sequence 

Nucleic Acid Accession #: none found 

Coding sequence: 1-813 (underlined sequence corresponds to start and stop codon) 



ATGAGAGACA ATAAATCGTG TGCTTTTTTC ATGGGAAAGT TAAATGTTTG TTTTG AAGGC 60 
ACAGTAATAG CAGGCTATTC AGTCTTTGCC ACTACCTGCA TCATTCATCT GGCTGTAGCT 120 
AGTGCACTAC AATCTCCTAA AAAGTCTTCT CACCCTCACA GGACTGCTCT ACATCTGGCC 180 
TCTGCCAATG GAAATTCAGA AGTAGTAAAA CTCCTGCTGG ACAGACGATG TCAACTTAAT 240 
ATCCTTGACA ACAAAAAGAG GACAGCTCTG ACAAAGGCCG TACAATGCCA GGAAGATGAA 300 
TGTGCGTTAA TGTTCCTGGA ACATGGCACT GATCCGAATA TTCCAGATGA GTAT GGAAA T 360 
ACCGCTCTAC ACTATGCTAT CTACAATGAA GATAAATTAA TGGCCAAAGC ACTGCTCTTA 420 
TACGGTGCTG ATATCGAATC AAAAAACAAG CATGGCCTCA CACCACTGTT ACTTGGTGTA 480 
CATGAGCAAA AACAGCAAGT GGTGAAATTT TTAATCAAGA AAAAAGCAAA TTTAAATCCA 540 
CTGGATAGAT ATGGAAGGTG TGTGACCTTG GGAACGTTAT TTACCACCAA ATATGTTGTC 600 
ATATATGAAA AGTAG 



SEQ ID NO:249 PBH2 Protein sequence: 
Protein Accession #: none found 

MRDNKSCAFF MOKLNVCFEG TVIAGYSVFA TTCHHLAVA S AljQFPKKSS HPHRTALHLA 60 
SANGNSEWK 1XLDRRCQLN ILDNKKRTAL TKAVQCQEDE CALMLLEHGT DPNIPDEYGN 120 
TALHYAIYNE DKLMAKAIXL YGADESKNK HGLTPIXLGV HEQKQQWKF UKKKANLNA 180 
LDRYGRCVTL GTLFTTKYW IYEK 



SEQ ID NO:250 PBJ1 DNA sequence 
Nucleic Acid Accession #: XM_005829 

Coding sequence: M043 (underlined sequence corresponds to start and stop codon) 

ATQGTGATCA TCTATCTTTC TTTCTGCAAT TATTACATGG AGTTCTACAG AGAAGAGCTT 60 
CCCCACATTG ACTATTTGAT TGACATTCAG TTTGCAACAG GAAAGGTTAC TCAGCCGGGA 120 
GAGGACACTT CCTACCATCA ATGCGCTCAG CTTGAAGCCA GAGACGAAGG CACCGACAGT 180 
TTATTATTAA ACAATGGCAG CAGCOCCACG CTGAAGACAC GAACGCGCTG TTATGGAACC 240 
CCCAGAGGTC TCCCCCATOG TAGCCTGCTC CAGCCGACTC CGCCCACATG TAAAACGAAG 300 
ATCAGGAGCA GATTTGAAGA ATTACAAAGT G AATTGGTGC CAGTCAGCAT GTCAGAGACA 360 
GACCACATAG CCTCTACITC CTCTGATAAA AATGTTGGGA AAACACCTGA ATTAAAGGAA 420 
GACTCATGCA ACTTGTTTTC TGGCAATGAA AGCAGCAAAT TAGAAAATGA GTCCAAACTA 480 
TTGTCATTAA ACACTGATAA AACTTTATGT CAACCTAATG A GCATAA TAA TCGAATTGAA 540 
GCCCAGGAAA ATTATATTCC AGATCATGGT GG AGGT GAGG ATTCTTGTGC CAAAACAGAC 600 
ACAGGCTCAG AAAATTCTGA ACAAATAGCT AATTTTCCTA GTGGAAATTT TGCTAAACAT 660 
ATTTCAAAAA CAAATGAAAC AGAACAGAAA GTAACACAAA TATTGGTGGA ATTAAGGTCA 720 
TCTACATTTC CAGAATCAGC TAATGAAAAG ACTTATTCAG AAAGCCCCTA TGATACAGAC 780 
TCCACCAAGA AATTTATTrC AAAAATAAAG AGCGTTTCAG CATCAGAGGA TTTGTTGGAA 840 
GAAATAGAAT CTGAGCTCTT ATCTACGGAG TTTGCAGAAC ATCGAGTACC AAATGGAATG 900 
AATAAGGOAG AACATGCATT AGTTCTGTTT GAAAAGTGTG TGCAAGATAA ATATTTGCAO 960 
CAGGAACATA TCATAAAAAA GTTAATTAAA GAAAATAAGA AGCATCAGGA GCTCTTCGTA 1020 
GACATTTGTT CAGAAAAAGA CAATTTAAGA GAAGAACTAA AGAAAAGAAC AGAAACTGAG 1080 
AAGCAGCATA TGAACACAAT TAAACAGTTA GAATCAAGAA TAGAAGAACT TAA TAAAG AA 1140 
GTTAAAGCTT CCAGAGATCA ACTAATAGCT CAAGACGTTA CAGCTAAAAA TGCAGTTCAG 1200 
CAOTTACACA AAGAGATGGC CCAACGGATG GAACAGGCCA ACAAGAAATO TO AAGA GGCA 1260 
CGOCAAGAAA AAGAAGCAAT COTAATGAAA TATGTAAGAG GTGAGAAGGA ATCTTTAG AT 1320 
CnCOAAAGG AAAAAGAGAC ACTTGAGAAA AAACTTAGAG ATGCAAATAA GGAACTTGAG 1380 
AAAAACACTA ACAAAATTAA GCAGCTTTCT CAGGAGAAAG OACGGTTGCA CCAGCTOTAT 1440 
GAAACTAAGO AAGGCGAAAC GACTAGACTC ATCAGAGAAA TAOACAAATT AAAGGAAOAC 1500 
ATTAACTCTC ACGTCATCAA AGTAAAGTGG GCACAAAACA AATTAAAAGC TGAAATGGAT 1560 
TCACACAAGO AAACCAAAGA TAAACTCAAA GAAACAACAA CAAAATTAAC ACAAGCAAAG 1620 
GAAGAAGCAG ATCAGATACG AAAAAACTOT CAGGATATGA TAAAAACATA TCAGGAGTCA 1680 

GAAGAAATTA AATCAAATGA GCTTGATGCA AAGCTTAGAG TCACAAAAGG AGAACTTOAA 1740 

AAACAAATGC AAGAAAAATC TGACCAGCTA GAGATGCATC ATGCCAAAAT AAAGGAACTA 1800 
OAAGATCTG A AG AG AACATT TAAGGAGGGT ATGGATG AGT TAAG AACACT GAGAACAAAG 1860 
GTGAAATGTC TAG AAGATGA ACGATTAAGA ACAGAAGATG AATTATCAAA ATATAAGGAA 1920 
ATTATTAATC GCCAAAAAGC TGAAATTCAG AATTTATTGG ACAAGGTOAA AACTGCAOAT 1980 
CAGCTACAGG AGCAGCTTCA AAGAGGTAAG CAAGAAATTG AAAATTTGAA AGAAGAAGTG 2040 

GAAAGTCTTA ATTCTTTGAT TAATGACCTA CAAAAAGACA TCGAAGGCAG TAGGAAAAGA 2100 
GAATCTGAGC TGCTGCTGTT TACAGAAAGG CTCACTAGTA AG AATGCACA GCTTCAGTCT 2160 
GAATCCAATT CTTTGCAGTC ACAATTTGAT AAAGTTTCCT GTAGTGAAAG TCAGTTACAA 2220 
AGCCAGTGTG AACAAATGAA ACAGACAAAT ATTAATTTGG AAAGTAGGTT GTTGAAAGAG 2280 
GAAGAACTGC GAAAAGAGGA AGTCCAAACT CTGCAAGCTG AACTCGCTTG TAGACAAACA 2340 

GAAGTTAAAG CATTCAGTAC CCAGGTAGAA GAATTAAAAG ATQAGTTAGT AACTCAGAGA 2400 
OGTAAACATG OCTCTAGTAT CAAGG ATCTC ACCAAACAAC TTCAGCAAGC ACQ AAGAAAA 2460 
TTAGATCAGG TTGAGAGTGG AAGCTATG AC AAAGAAGTCA GCAGCATGGG AAGTCGTXCT 2520 
AGTTCATCAG GGTCCCTGAA TGCTCGAAGC AGTGCAGAAG ATCGATCTCC AGAAAATACT 2580 
GGGTCCTCAG TAGCTGTGGA TAACTTTOCA CAAGTAGATA AGGCCATGTT GATTGAGAGA 2640 

ATAGTTAGGC TGCAAAAAGC ACATGCCCGG AAAAATGAAA AGATAGAATT TATGGAGGAC 2700 
CACATCAAAC AACTGGTGGA AGAAATTAGG AAAAAAACAA AAATAATTCA AAGTTATATT 2760 
7TACGAGAAG AATCAGGCAC ACTTTCI 1CA GAGGCATCTG ATTTTAACAA AGTTCATTTA 2820 
AGTAQACGGG GTGGCATCAT GGCATCTTTA TATACATCCC ATCCAGCTGA CAATGGATTA 2880 
ACATTGGAGC TCTCTTTGGA AATCAACCGA AAATTACAGG CTGTTTTGGA GGATACGTTA 2940 

CTAAAAAATA TTACTTTGAA GGAAAATCTA CAAACACTTG GAACAGAAAT AGAACGTCTT 3000 
ATTAAACACC AGCATGAACT AGAACAGAGG ACAAAGAAAA CCTAAAACAA GCCTCTTGCT 3060 
CAGTAAAGAG ACAAAAGCCA CACAGGAGTA GGTGCCACTG ACCTCTATTG TTGGAGACTT 3120 
TGTTOCA C IT TTTOn 1C AO CCAGTAAAAA T ATTOTTT TG CTTCATCTGT ACACAAAAAA 3180 
ATACCCTTTT ACAATATGAA TGCATTGCTC TATATACTGT AAGACTGAAA GCTTTGATGA 3240 

AATTTGTTTT TGTATGGTGC AATATGACAG CCTGTCATTG AATCTAAACA ACTTAATTTG 3300 
CTTCTATTCA TAAG AAGTGT TG AACATTAC AAGGGCTTTT AT 



SEQ ID NO:251 PBJ1 Protein sequence: 
Protein Accession #: NP_060487 

MVHYLSFCN YYMEFYREEL PHIDYIJDIQ FATGKVTQPG EDTSYHQCAQ LEARDEGTDS 60 
LIXNNGSSAT LKTRTRCYGT PRGLPHRSLL QPTPPICTTI K KSREEELQS ELVPVSMSET 120 
DHIASTSSDK NVGKTPELKE DSCNLFSGNE SSKLENESKL LSLNTDKTLC QPNEHNNRIE 180 

AQENYIPDHa GGEDSCAKTD TGSENSEQIA NFPSGNFAKH ISKTNETEQK VTQILVELRS 240 
STFFESANEK TYSESPYDTD CTKKFISKIK SVSASEDLLE EIESELLSTE FAEHRVPNG M 300 
NKGEHALVLF EKCVQDKYLQ QEHnKKLIK ENKKHQELFV DICSEKDNLR EELKKRTETE 360 
KQHMNTIKQL ESREELNKE VKASRDQUA QDVTAKNAVQ QLHKEMAQRM EQANKKCEEA 420 

RQEKEAMVMK YVRGEKESLD LRKEKETLEK KLRDANKELE KNTNKDCQLS QEKGRLHQLY 480 

ETKEGETTRL KHDKLKED 1NSHVHCVKW AQNKLKAEMD SHKETKDKLK ETTTKLTQAK 540 
EEADQUUCNC QDMKTYQES EEKSNELDA KLRVTKGELE KQMQEKSDQL EMHHAKDCEL 600 
EDLKRTFKEG MDELRTLRTK VKCLEDERLR TEDELSKYKE IINRQKAEIQ NLLDKVKTAD 660 
QLQEQLQRGK QEIENLKEEV ESLNSIJNDL QKDtEGSRKR ESELLLFTER LTSKNAQLQS 720 
ESNSLQSQFD KVSCSESQLQ SQCEQMKQTN INLESRLLKE EELRKEHVQT LQAELACRQT 780 

EVKALSTQVE ELKDELVTQR RKHASSIKDL TKQLQQARRK tOQVESGSYD KEVSSMGSRS 840 
SSSGSLN ARS S AEDRSPENT GSSVAVDNFP QVDKAMUER IVRLQKAHAR KNEHEFMED 900 
HUCQLVEEK KKTKQQSYI LREESGTLSS EASDFNKVHL SRRGGIMASL YTSHPADNGL 960 
TLELSLEINR KLQAVLEDTL LKNTTLKENL GTLGTEIERL IKHQHELEQR TKKT 


SEQ ID NO:252 PBJ6 DNA sequence 
Nucleic Acid Accession #: D83760 

Coding sequence: 56-1459 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

| | | | | | 

TTGCCGTGAA GGGCTGTGCG GTTCCCOTGC GCGCCGGAGC CTGCTGTGGC CTCTTATGCA 60 

CTCCACCACC OCCATCAGCT CCCTCTTCTC CTTCACCAGC CCCGCAGTGA AGAGACTGCT 120 

AGGCTGGAAG CAAGSAGATG AAGAGGAAAA GTGSCCAGAQ AAGGCAGTGG ACTCTCTAGT 180 

GAAGAAGTTA AAGAASAAGA AGGGAGCCAT GGACGAGCTG GAGAGGGCTC TCAGCTGCCC 240 

GGGGCAGCCC AGCAAATGCG TCACGATTCC CCGCTCCCTG GACGGGCGGC TGCAGGTGTC 300 

CCACCGCAAG GCCCTGCCCC ATGTGATTTA CTGTCGCGTG TGGCGCTGGC CGQATCTGCA 360 

CTCCCACCAC GAGCTGAAGC CGCTGGAGTG CTGTGAGTTC CCATTTGGCT CCAAGCAGAA 420 

AGAAGTGTGC ATTAACCCTT ACCACTACCG CCGGGTGGAG ACTCCAGTAC TGCCICCTGT 480 

OCTCGTGCCA AGACACAGTG AATATAACCC CCAGCTCAGC CTCCTGGCCA AGTTCCGCAG 540 

CGCCTCCCTG CACAGTGAGC CACTCATGCC ACACAACGCC ACCTATCCTG ACTCTTTCCA 600 

GCAGOCTCCQ TGCTCTGCAC TCCCTCCCTC ACCCAGCCAC GCGTTCTCCC AGTCCCCGTG 660 

CACGGCCAGC TACCCTCACT CCCCAGGAAG TCCTTCTGAG CCAGAGAGTC CCTATCAACA 720 

CTCAGTTGAC ACACCACCCC TGCCTTATCA TGOCACAGAA GCCTCTGAGA CCCAGAGTGG 780 
CCAACCTQTA GATGCCACAG CTGATAGACA TCTAGTGCTA TCGATACCAA ATGGAGACTT 840 

TCGACCAOTT TGTTACGAGG AGCCCCAGCA CTGOTOCTCG GTCGCCTACT ATGAACTGAA 900 

CAACCGAGTT GGGGAGACAT TCCA GGC TTC CTOCCGAAGT GTGCTCATAG ATGGGTTCAC 960 

CGACCCTTCA AATAACAGOA ACAGATTCTG TCTTGGACTT CTTTCTAATG TAAACAGAAA 1020 

CTCAACGATA GAAAATACCA GGAGACATAT AGGAAAGGGT GTGCACTTGT ACTACGTC6G 1080 

GGGAGACGTG TATGOCGAGT GCGTGAGTGA CAGCAGCATC TTTGTGCAGA GCCGOAACTG 1140 

CAACTATCAA CACGGCTTCC ACCCAGCTAC CGTCTGCAAH ATCCCCAGCG GCTCCAGCCT 1200 

CAAGGTCTTC AACAACCAGC TCTTCGCTCA GCTCCTGGCC CAGTCAGTTC ACCACGGCTT 1260 

TGAAQTCGTG TATGAACTGA CCAAGATGTG TACTATCCGG ATGAGTTTTG TTAAGGGTTG 1320 

GGGTGCTGAG TATCATCGCC AGGATGTCAC CAGCACCCCC TGCTGGATTG AGATTCATCT 1380 

TCATOGGCCA CTGCAGTGGC TGGACAAAGT TCTGACTCAG ATGGQCTCTC CACATAACCC 1440 
CATTTCTTCA GTQTCTTAAC AGTCATCTCT TAAGCTGCAT TTCCATAGGA T 



SEQ ID NO:253 PBJ6 Protein sequence: 
Protein Accession #: NP_005898 

MHSTTPISSL FSFTSPAVKR UjGWKQGDEE EKWAEKAVDS LVKKUCKKKG AMDELERALS 60 
CPGQPSKCVT IPRSLDGRLQ VSHRKGLPHV IYCRVWRWPD LQSHHELKPL ECCEFPFGSK 120 
QKEVONPYH YRRVEIPVLP PVLVPRHSEY NPQLSLLAKF RS ASLHSEPL MPHNATYPDS 180 
PQQPPCSALP PSPSHAF5QS PCTASYPHSP GSPSEPESPY QHSVDTPPLP YHATEASBTQ 240 
SGQPVDATAD RHWLSIPNG DFRPVCYEEP QHWCSVAYYE LNNRVGETFQ ASSRSVLIDG 300 
FIDPSNNKNR FCLGIXSNVN BNSTENTRR HIGKGVHLYY VGOBVYABCV SDSSIFVQSR 360 
NCNYQHGFHP ATVCfOPSGC SLKVFNNQLF AQLLAQSVHH GFEWYELTK MCTKMSFVK 420 
OWGAEYHRQD VTSTPCWIEI HLHGPLQWLD KVLTQMGSPH NPISSVS 



SEQ ID NO:254 PBJ8 DNA sequence 
Nucleic Acid Accession #: AB04684 

Coding sequence: 472-4377 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

| | | | | | 

TGCAGGTTTG CAGGGTCTGA GATTACTTGG GCTTTTCCTG CCTTTTTCTT TTGCTTAAGG 60 

GATGGACAAO GAGCTGAGAT TTATQACCCT TATTAGAGAA AAAAATGTGC CTTGCTAGGG 120 

TGGGGACACT TGGTTGATGC AGTCTCTCTC TCTCTTTCTC GGTGTTTATA ACAAAACAAA 180 

ACCAAAATGA ACTGAGGGGT TTGTAATGGT AGTTTGTTTG TTGCTGGAGA ATGCTACTTT 240 

GCATGCTTTT TTTCTCTTGC AGGGTATCTT CTGTCTTGTG CTTTTTCTTT TAGAAGCTAC 300 

TAAAGGGTGT TGGGGATGCT TCTGACTATT ATGAAGGCCA AAAGGCCTGT TGACTGGGGC 360 

TGCTTTTAAC CCTTTCCTAT TTGCTGAGAA TGCAGCCGTG TGACAGTAAC TGAACATTGG 420 

TCTAAAGTCT TTCCAAAAGG TCAAGGTTCA CAAGAACATC TGCTCAAATT AATGACCATG 480 

GGGGATATGA AGACCCCAGA CTTTGATGAC CTCCTGGCAG CATTTGACAT CCCAGATATG 540 

GTCGATCCTA AAGCAGCTAT TGAGTCTGGA CACGATGACC ATGAAAGCCA CATGAAGCAG 600 

AATGCTCACG GAGAGGATGA CTCCCACGCA CCATCATCTT CTGATGTGGG TGTCAGCGTT 660 

ATCGTCAAGA ATGTTCGGAA CATTGACTCT TCCGAGGGCG GGGAGAAAGA CGGCCACAAC 720 

CCCACTGGCA ATGGCTTACA TAATGGGTTT CTCACAGCAT CCTCCCTTGA CAGTTACAGT 780 

AAAGATGGAG CAAAGTCCTT GAAAGGAGAT GTGCCTGCCT CTGAGGTGAC ACTCAAAGAC 840 

TCGACATTCA GCCAGTTTAG CCCGATCTCC AGTGCTGAAG AGTTTGATGA OGACGAGAAG 900 

ATTGAGGTGG ATGACCCCCC TGACAAGGAG GACATGCGAT CAAGCTTCAG GTCGAATGTG 960 

TTGACGGGGT CGGCTCCCCA GCAGGACTAC GATAAGCTGA AGGCACTCGG AGGGGAAAAC 1020 

TCCAGCAAAA CTGGACTCTC TACGTCAGGC AATGTGGAGA AAAACAAAGC TGTTAAGAGA 1080 

GAAACAGAAG CCAGTTCTAT AAACCTGAGT GTTTATGAAC CTTTTAAAGT CAGAAAAGCA 1140 

GAGGATAAAT TGAAGGAAAG CTCTGACAAG GTGCTGGAAA ACAGAGTCCT AGATGGGAAG 1200 

CTCAGCTCCG AGAAGAATGA CACCAGOCTC CCCAGCGTTG CGCCATCAAA GACAAAGTCG 1260 

TCCTCCAAGC TCTCGTCCTG CATCGCTGCC ATOGCGGCTC TCAGCGCTAA AAAGGCGGCT 1320 

TCAGACTCCT GCAAAGAACC AGTGGCCAAT TCGAGGGAAT CCTCCCCGTT ACCAAAAGAA 1380 

GTAAATGACA GTCCGAGAGC CGCTGACAAG TCTCCTGAAT CCCAGAATCT CATCGACGGG 1440 

ACCARAAAAC CATCCCTQAA GCAACCGGAT AGTCCCAGAA GCATCTCAAQ TGAGAACAGC 1500 

AGCAAAGGAT CCCCGTCCTC TCCCGCAGGG TCCACACCAO CAA1CCCCAA AGTCCGCATA 1560 

AAAACCATTA AGACATCTTC TGGGGAAATC AAGAGAACAG TGACCAGGGT ATTGCCAGAA 1620 

GTGGATCTTG ACTCTGGAAA GAAACCTTCC GAGCAGACAG CGTCCGTGAT GGCCTCTGTO 1680 

ACATCOCTTC TGTCCTCTCC AGCATCAGCC GCCGTCCTTT CCTCTCCCCC CAGGGCGCCT 1740 

CTCCAGTCTG CGGTOGTGAC CAATGCAGTT TCCCCTGCAG AGCTCACCCC CAAACAOGTC 1800 

ACAATCAAGC CTGTGGCTAC TGCTTTCCTC CCAGTGTCTG CTGTGAAGAC GGCAGGATCC 1860 

CAAGTCATTA ATTTGAAGCT CGCTAACAAC ACCACGGTGA AAGCCACGGT CATATCTGCT 1920 

GCCTCTGTCC AGAGTGCCAG CAGCGCCATC ATTAAAGCTG CCAACGCCAT CCAGCAGCAA 1980 

ACTGTCGTQG TGCCGGCATC CAGCCTGGCC AATGCCAAAC TCGTGCCAAA GACTGTGCAC 2040 

CTTGCCAACC TTAACCTTTT GCCTCAGGGT GCCCAGGCCA CCTCTGAACT CCGCCAAGTG 2100 

CTAACCAAAC CTCAGCAACA AATAAAGCAG GCAATAATCA ATGCAGCAGC CTCGCAACCC 2160 

CCCAAAAAGG TGTCTCGAGT CCAGGTGGTG TCGTCCTTGC AGAGTTCTGT GGTGGAAGCT 2220 

TTCAACAAGG TGCTGAGCAG TGTCAATCCA GTCOCTGTTT ACATCCCAAA CCTCAGTCCT 2280 

CCCGCCAATC CAGGGATCAC GTTACCGACG CGTGGGTACA AGTGCTTGGA GTGTGGGGAC 2340 

TCCTTTGCAC TTGAAAAGAG TCTGACCCAG CACTACGACA GACGGAGOGT GCGCATCGAA 2400 

GTAACGTGCA ACCATTGTAC AAAGAACCTC GTTTTTTACA ACAAATGCAG CCTCCITTCC 2460 

CATGOOCGTG GGCATAAGGA GAAAGGGGTG GTAATGCAAT GCTCCCACTT AATTTTAAAG 2520 

CCAGTCCCAG CAGATCAAAT GATAGTTTCT CCGTCAAGCA ATACTTCCAC TTCAACTTCC 2580 

ACTCTTCAGA GCCCTGTGGG AGCTGGCACA CAGACTGTCA CAAAAATTCA GTCTGGCATA 2640 

ACTGGGACAG TCATATCGGC TCCTTCAAGC ACTCCCATCA CCCCAGCCAT GCCCCTAGAT 2700 

GAAGACCCCT CCAAACTGTG TAGACATAGT CTAAAATGTT TGGAGTGTAA TGAAGTCTTC 2760 

CAGGACGAGA CATCACTGGC TACACATTTC CAGCAGGCTG CAGATACGAG TGGACAAAAG 2820 
ACTTCCACTA TCTGCCAJ3AT GCTGCTTCCT AACCAGTGCA GTTATGCATC ACACCAQAGA 2880 

ATCCATCACC ACAAATCTCC CTACACCTGC CCTGAGTGTG GGGCCATCTG CAGGTCGGTG 2940 

CACTTOCAGA CCCACGTCAC CAAGAACTGT CTCCACTACA CGAGGAGAGT TGGTTTTCGA 3000 

TGTGTOCATT GCAATGTTGT GTACTCTOAT GTGOCTGCTC TGAAGTCTCA CATTCAA6CT 3060 

TCTCACTGTG AAG T CTT CTA CAAGTQTCCT ATTTGTCCAA TGGCGTTTAA GTCTGCCCCA 3120 

AGCACACATT CCCACGCCTA CACACAGCAT CCTGGCATCA AQATAGGAGA ACCAAAAATA 3180 

ATATATAAGT GTTCCATGTG CGACACTGTG TTCACCCTGC AAACCTTGCT GTATCGCCAC 3240 

TTTGACCAAC ACATTGAAAA CCAGAAGGTG TCTGTTTTCA AGTGTCCAGA CTGTTCTCTT 3300 

TEATATGCAC AGAAGCAACT TATGATGGAC CATATCAAGT CTATGCATGG AACATTGAAA 3360 

AGTATTGAAG GGCCTCCAAA CTTGGGTATA AACTTGCCTT TGAGCATTAA GCCTGCAACT 3420 

CAAAATTCAG CAAATCAGAA CAAAGAGGAC ACCAAATCCA TCAATGGGAA A8AGAAATTG 3480 

GAAAAGAAAT CTCCATCTCC TGTGAAAAAA TCAATGGAAA CCAAGAAAGT GGCCAGTOCT 3540 

GGGTGGACGT GT7GGGAGTG TGACTGCCTG TTCATGCAGA GAGATGTGTA CATATCCCAC 3600 

GTGAGGAAGG AGCACGGGAA GCAAATGAAG AAACACCGCT GCCGCCAGTG TGACAAGTCT 3660 

TTCAGCTCGT OCCACAGCCT GTGOOGGCAC AACCGGATCA AGCACAAAGG CATCAGGAAA 3720 

GTGTACGCCT GCTCGCACTG CCCAGACTCC AGACGTAOCT TTACCAAACG TTTGATGCTG 3780 

GAGAAGCACG TCCAGCTGAT GCATGGCATC AAGGACOCTG ACCTGAAAGA AATGACAGAT 3840 

GCCACCAATG AGGAGGAAAC AGAAATAAAA GAAGACACTA AGGTCCCCAG TCCCAAGCGG 3900 

AAGTTGGAAG AACCAGTTCT GGAGTTCAGG CCTCCCCGAG GAGCAATCAC TCAACCACTG 3960 

AAAAAGCTGA AAATCAATGT TTTTAAGGTT CACAAGTGTG CCGTGTGTGG CTTCACCACC 4020 

GAAAACCTGC TGCAATTCCA CGAACACATC OCTCAGCACA AATCGGATGG TTCTTCCTAC 4080 

CAGTGCCGGG AGTGTGGCCT CCGCTACACG TCTCACGTCT CTCTGTCCAG GCACCTCTTC 4140 

ATCGTACACA AGTTAAAGGA ACCTCAGCCA GTGTCCAAGC AAAATGGGGC TGGGGAAGAT 4200 

AAOCAACAGG AGAACAAACC CAGCCACGAG GATGAATCCC CTGATGGCGC CGTGTCAGAC 4260 

AGAAAGTGCA AAGTGTGCGC AAAAACTTTT GAAACTGAAG CTGCCTTAAA TACTCACATG 4320 

0G6ACACACG GCATGGCCTT CATCAAATCC AAAAGGATGA GCTCAGCCGA GAA ATAGC CA 4380 

CAGATGCTCC ATGAGGAAAA TCCCTGTCCA CATTGGAAXA AAAAAGACAT TTTTGTTACA 4440 

AAGTTTCCAG TAXAATAGAG TTAACAGTAC TGTCTAGGCT GTTGCAATAT ATTCTCTTTC 4500 

AATGTACCTT CCTTCACCTC GTCGTATATA TCCTCGATAA GTATTAAAAC AGTATTTGAG 4560 

TTTAAAAGAG TTTGTATATA TTTAAATGAA TAACTTTTTA TACTCTTTGT TACATGTTTG 4620 

TATCAGTATT TAGtGGAAAA CCATTTGAGT TGTTTTGGGT TAGAATTTTT CTTTTTGTAC 4680 

TOTTTCTTTA AAACAGAGTT CTTAGTAACA GGGGCAOTTC CTGAATTCAA ATAAACCATT 4740 

TTGTATGTTT GGATTTTGAA TGGGTTAACT AATTACAGGC TAAAATAATG CCTTTTTTAG 4800 

TGTTTTTAAT TTTXAGAATT CACTACATAA ATTGTAAGTA ATTGTGGGTC TCAAAAACAC 4860 

TAGGAACTTT TAAGTGTCTT AGCACTTCCT CGATGTGCCT GCCCTGAGGG AGTGAGTTCA 4920 

CATTTGAGAC AACTGCACTC CAGTGTGGAC GTGCCTTTGT CTTCAGGCCA TGCCGAAGGG 4980 

TGITTAAAGC AGTCTTGCAG GTCGCTCCTT TCCCAGCCGT GGATAAAAAC TGAAGCTAGG 5040 

AATCTAATAA GGAATGCTGA TTTCCTCAGT TCCATTTTGA GGAATGGGGA AGGCTATTCT 5100 

AAAGAAAAAA ATGGGATTTG TTTTCTCGGC AGATCTGCAA GGCTGCCTTT AAGAGCACAA 5160 

GGAGGGAAAG TAACGAAAGG GCTGGACTAC TATAAAAGTT ACAAATACGT AGTTAGACCA 5220 

ATAGATTTAT ATAGTCAGGT TTTTGTCATG TAATTTATTA ACTAACTATT ACAGAAACAC 5280 

AGCTAAGAAT ATCAAGTATT TCTCTGGCTC TTGACAGAAA AAAATCAGTT GACTTAACCC 5340 

TTTGCTGTCA AAAGAGTTGG CGTTTCCTGT TCTGGGTGCT ACTGCCAAAC GTTATGGTAC 5400 

TTAGAGTCGG GA1GCACAAC TTCAACCACC GACTTATCAA TGCAGCCGCC TGTGTATTGC 5460 

AATTGGCCGT TACCTTAAGC ACTGAGCCAC CCGGGTTTAG TTCAGOCATT TCAAGAAGTA 5520 

tATTTAACGT CGGTAGTTCT GCTTTATTAA AATGCAGCAG AGGTACICTT CTGTCCCTTC 5580 

CGTTTATAGT TCTCTGAGAG AGTTCTATTT TTTGGTTTTG TTTTGTGTTT TCTTTTGCAT 5640 

TTTGTATCTT GTATTTATCC CTGAACATGT TTTCTACCTT TTTTTTTTTT TTTTTTTTAA 5700 

GAAAAGGAAT TCTTTTGTGT ATATATAGAT ACTTGCATGA TATACTGTAG TCAATGTTCG 5760 

GTTCCTCAAA AGGTCTTGCT GCTGTCAGGT GTTATGCACT CCATCCATCA TAACTGTATG 5820 
AAACACATTT CATATGTAAA TAAACGTGGG ACATTTG 



SEQ ID NO:255 PBJ8 Protein sequence: 
Protein Accession #: BAB13455 

MKTPDFDDLL AAFDIPDMVD PKAAESGHD DHESHMKQNA HGEDDSHAPS SSD VGVSVIV 60 
KNVRNTDSSE GGEKDOHNFT GNGLHNGFLT ASSLDSYSKD GAKSLKGDVP ASEVTLKDST 120 
FSQFSPISS A EEFDDDEKIE VDDFFDKEDM RSSFRSNVLT GSAPQQDYDK LKALGGENSS 180 
KTOLSTSGNV EKNKAVKRET EASSWLS VY EPFKVRKAED KLXESSDKVL ENRVLDGKLS 240 
SEKNDTSLPS VAPSKTKSSS KLSSCIAAIA ALSAKKAASD SCKEPVANSR ESSPLPKEVN 300 
D3PRAADKSP ESQNUDGTK KPSLKQPDSP RSISSENSSK GSPSSPAGST PAIPKVRIKT 360 
IKTSSGEIKR TVTRVLPEVD LDSGKKPSEQ TASVMASVTS LLSSPASAAV LSSPPRAFLQ 420 
SAWTNAVSP AELTPKQVTI KPVATAFLPV SAVKTAGSQV INLKLANNTT VKATVISAAS 480 
VQSASSAUK AANAIQQQTV WPASSLANA KLVPKTVHLA NLNLLPQOAQ ATSELRQVLT 540 
KPQQQKQAI INAAASQPPK KVSRVQWSS LQSSWEAFN KVLSSVNPVP VYIPNLSPPA 600 
NAGITLPTRG YKCLECGDSF ALEKSLTQHY DRRSVREVT CNHCTKNLVF YNKCSLLSHA 660 
RGHKEKGWM QCSHLILKPV PADQMIVSPS SNTSTSTSTL QSPVGAGTHT VTKIQSGITG 720 
TVISAPSSTP nPAMPLDED PSKLCRHSLK CLECNEVFQD ETSLATHFQQ AADTSGQKTC 780 
TICQMLLPNQ CSYASHQRM QHKSPYTCPE CGAICRSVHF QTHVTKNCLH YTRRVGFRCV 840 
HCNWYSDVA ALKSHIQGSH CEVFYKCPiC PMAFKSAPST HSHAYTQHPG KIGEPKIIY 900 
KCSMCDTVFT UQTLLYRHFD QHIENQKVS V FKCPDCSLLY AQKQLMMDHI KSMHGTLKSI 960 
EGPFNLGINL PLSIKPATQN SANQNKEDTK SMNGKEKLEK KSPSPVKKSM ETKKVASPGW 1020 
TCWECDCLFM QRDVYISHVR KEHGKQMKKH PCRQCDKSFS SSHSLCRHNR IKHKGIRKVY 1080 
ACSHCPDSRR TFIKRLMLEK HVQLMHGiKD PDLKEMTDAT NEEETEKED TKVPSPKRKL 1140 
EEPVLEFRPP RGAITQPLKK LKINVFKVHK CAVCGFTTEN LLQFHEHIPQ HKSDGSSYQC 1200 
RECGLCYTSH VSLSRHLFIV HKLKEPQPVS KONGAGEDNQ QENKPSHEDE SPDGAVSDRK 1260 
CKVCAKTFET EAALNTHMRT HGMAFIKSKR MSSAEK 
SEQ ID NO:256 PBM1 DNA sequence 
Nucleic Acid Accession #: AF111W7 

Coding sequence: 58-1608 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

| | | | | | 

TTTTCQTCGA CTCTTAOCGG TTGGCTGGGC CAGCTGCOCC GCGGCTCACA GCTGACGATG 60 

GGGGACCCCA GCAAGCAGGA CATCTTGACC ATCTTCAAGC GCCTCOGCTC GGTGCCCACT 120 

AACAAGGTGT GTTTTGATTG TGGTGCCAAA AATCCCAGCT GGGCAAGCAT AACCTATGGA 180 

GTGTTCCTTT GCATW3ATTQ CTCAGGGTCC CACCGGTCAC TTGGTGTTCA CTTGAGTTTT 240 

ATTCGATCTA CAGAGTTGGA TTCCAACTGG TCATGGTTTC AGTTGCGATG CATGCAAGTC 300 

GGAGGAAACG CTAGTGCATC TTC C TTTT T T CATCAACATG OOTOTTCCAC CAATGACACC 360 

AATGCCAAGT ACAACAGTCG TGCTGCTCAG CTCTATAOGG AGAAAATCAA ATCGCTCGCC 420 

TCTCAAGCAA CACGGAAGCA TGGCACTGAT CTGTGGCTTG ATAGTTGTGT GGTTCCACCT 480 

TTOTCCCCTC CACCAAAGGA GGAAGATTTT TTTOCCTCTC ACGTTTCTCC TGAGGTGAGT 540 

GACACAGCGT GGGCATCAGC AATAGCAGAA CCATCTTCTT TAACATCAAG GCCTOTGGAA 600 

AOCACTTTGG AAAATAATGA AQGTGGACAA GAGCAAGGAC CAAGTGTGGA AGGTCTTAAT 660 

GTACCAACAA AGGCTACTTT AGAGGTATCC TCTATCATAA AAAAGAAACC AAATCAAGCT 720 

AAAAAAGGCC TTGGGGCCAA AAAAGGAMJT TTGGGAGCTC AGAAACTGGC AAACACAXGC 780 

TTTAATCAAA TTGAAAAACA AGCTCAAGCT GCGGATAAAA TGAAGGAGCA GGAAGACCTG 840 

GOCAAGGTGG TATCTAAftGA AGAATCAATT GTTTCATCAT TACGATTAGC CTATAAGGAT 900 

CTTGAAATTC AAAT6AAGAA AGACGAAAAG ATGAACATTA GTGGCAAAAA AAATGTTGAC 960 

TCAGACAGAC TCGGCAT6GG ATTTGGAAAT TGCAGAAGTG TTATTTCACA TTCAGTGACT 1020 

TCAGATATGC AGACCATAGA GCAGGAATCA CCCATTATGG CAAAACCAAG AAAAAAGTAT 1080 

AATGATGACA GTGACGATTC ATATTTTACT TCCAGCTCAA GTTACTTTGA CGAGCCAGTG 1140 

GA6TXAAGGA GCAGTTCTTT CTCTACCTGG CATGACACTT CAGATTCCTA TTGGAAAAAA 1200 

GAGACCAGCA AAGATACTGA AACAGTTOT G AAAACCACAS GCTATTCAGA CAGACCTACT 1260 

GCTCGCCGCA AGOCAGATTA TGAGCCAGTT GAAAATACAG ATGAGGCCCA GAAGAAGTTT 1320 

GGCAATGTCA AGGCCATTTC ATCAGATATG TATTTTGGAA GACAATCCCA GGCTGATTAT 1380 

GAGACCAGGG CCCGCCTAGA GAGGCTGTCG GCAAGTTCCT CCAXAAGCTC GGCTGATCTG 1440 

TTCGAGGAGC CGAGGAAGCA GCCAGCAGGG AACTACAGOC TGTOCAGTGT GCTGCCCAAC 1500 

GCCCCCGACA TGGCGCAGTT CAAGCAGGGA GTGAGATCGG TTGCTGGAAA ACTCTGCGTC 1560 

TTTGCTAATG GAGTCGTGAC TTCAATTCAG GATCGCTACG GTTCTTAATA CTGAAGTCAT 1620 

GATGTGTATT TCCTGGAGAA ATTCCTCTTT AAATGAACAA GTAACCACAT CTCAGGCGGC 1680 

AGTGAAGTCC AGATAGTTTT GCAGATTGTT TTGCTACTTT TTCATATGGT ATATGTTTCT 1740 

GATTTTTAAT ATTTCTTTTG AGAAATTCTG AGTTCTGATG TAGGAGCTTT CCTGTGATTT 1800 

CTGTTTCACG TTCCTTCCTG TCACACCCTC CTTTGGOGTC TCTGTGTATA TCCTTGCTTT 1860 

ATTTTCTTGG AACCTTTGAT TTCAACACTG AGGGCCTGGA GACCTCGGCT CCTCCTGCTC 1920 

CTGAACCAGG AGGCTTCATG TGGGGGAGGA GGAGAGGTCT CCATGTGACA CATGGGCTCA 1980 

GGGCTGCCAG AATCAGCGGA TGCTGGATGG GCCTGCAGAA ACAACACTCA CCACACACAC 2040 

TTCCTTCAAA AGACCAAAAG TGACTGGTGT CTCGTCTGAC AGATTGCTTC ATTTATGTTT 2100 

CTACATAGTA AGGTGACTGC CAAATAATAT TTGAAGTCAT CTGTCTCTTT GTAAATTATT 2160 

TTATATGACC TATAAATTTA AAAATOTTTT TCAGTGAGTG CTTTTAACAA ACTTAAGCTT 2220 

CTGCOCTGCC AAGGGAATTA ATGTTATCTT GTGAAAGGTG TTGCTGTTTC AATTGATGAG 2280 

AAATGGAAGA TGAGAACTCC CTAAGAGTTC TCATAATAAA TCATCTCATC ACAAATCAAT 2340 

ACGGTATACA GAGTIAAAST GGAATGAGGT AAGAAGAXAC AGCTACAGAA AATASTTGCG 2400 

TGTATGGGAG AACAGTCATT GTAATTGGGT AGTTTTGTTA ATAAATATTT TTAAATCTTG 2460 

CTTTTCAGAA ATTACCGAAT GTGTATAAAC AAATAAAGAA AAAXAATTTA GCTGTGTTTT 2520 

AGACAGCATT AGAATATATT GTTCAGCACA GTAAAATATA TTTGAAATTT GATAAGCCAA 2580 

AAAtGTGGTT TTGAATGAAT ATTTTGTGAA TCTTTCTTAA AAGCTCAAAT TTGTAGACTT 2640 

CTAAATAGAA TAAACACTTG CAGCAGAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2700 

AAABAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2760 
AAAAAAAAAA AAAAAAAAAA AAAAAAAA 



SEQ ID NO:257 PBM1 Protein sequence: 
PBM1 Protein sequence CAB76S01 

MGDPSKQDO. TIFKRLRSVP TNKVCFDCGA KNPSWASITY GVFLQDCSG SHRSLGVHLS 60 
FKSTCLDSN WSWFQLRCMQ VGGNASASSF FHQHGCSTND TNAKYNSRAA QLYKEKKSL 120 
ASQATRKHGT DLWLDSCWP PLSPPPKEED FFASHVSPEV SDTAWAS A1A EPSSLTSRPV 180 
ETTLENNEGG QEQGPSVEGL NVPTKATLEV SSIUCKKPNQ AKKGLGAKKG SLGAQKLANT 240 
CFNEEKQAQ AADKMKEQED LAKWSKEES IVSSLRLAYK DLHQMKKDE KMNISGKKNV 300 
DSDR1GMGFG NCRSVISHS V TSDMQ TIEQE SP1MAKPRKK YNDDSDDSYF TSSS5YFDEP 360 
VELRSSSFSS WDDSSDS YWK KETSKDTETV LKTTGYSDRP TARRKPDYEP VENTDEAQKK 420 
FGNVKABSD MYPGROSQAD YETRARLERL SASSStSSAD LFEEPRKQPA GNYSLSSVLP 480 
NAPDMAQFKQ GVRSVAGKLS VFANGWTSI QDRYGS 



SEQ ID NO:258 PBM4 DNA sequence 
Nucleic Acid Accession #: D30891 

Coding sequence: 1-4032 (underlined sequence corresponds to start and stop codon) 

AJGGATACTG TCATGAAGCA GACA CATGCT GACACACCTG TTOATCATTO TCTATCTGGC 60 
ATAAGAAAGT GTAGCAGCAC CTTTAAGCTT AAAAGTGAAG TCAACAAGCA TGAAACAGCC 120 
CTTGAAATGC AG AATCCAAA TTTGAACAAT AAAG AATGTT GTTTCACCTr TACGTTG AAT 180 
GGAAACTCCA GAAAATTAGA CCGTAGTGTG TTTACAGCAT ATGGTAAACC CAGCGAGAGT 240 
ATCTACTCAG CCCTGAGTGC TAATGACTAT TTCAGTGAAA GGATAAAGAA TCAGTTTAAT 300 
AAGAACATTA TTGTTTATGA AGAAA AGAC A ATAGATGGAC ATATAAATTT AGGAATGCCT 360 
CTCAAGTGCC TGCCTAGTG A TTCTCATTTT AAAATTACAT TTGGTCAAAG AAAGAGTAGC 420 
AAAGAAGATG GACACATATT ACGCCAATGT C AAAATOCAA ACATGGAATO CAJTCTTI II 480 
CAIGUlil 1U CTATAGG AAG GACAAGAAAG AAO ATTGTTA AGATCAACGA ACTTCATGAA 540 
AAAGGAAGTA AACTTTGTAT TTATGCCTTG AAGGGTOAOA CTATTGAAGG AGCCTTATGC 600 
AAGGATGGCC GTTTTCGGTC TOACATAGGT GAATTTGAAT CGAAACTAAA GG AAGGTCAT 660 
AAGAAAATTT ATGG AAAACA GTCCATGGTG GATGAAGTAT CTGGAAAAGT CTTAGAAATG 720 
GACATTTCAA AAAAAAAAGC ATTACAACAG AAAGATATCC ATAAAAAAAT TAAACAGAAT 780 
GAAAGTGCCA CTGATGAAAT TAATCACCAG AGTCTOATAC AGTCTAAGAA AAAAOTCCAC 840 
AAACCAAAGA AAGATGGAGA GACCAAAGAT GTAGAACACA GCAGAGAGCA AATICTOCCA 900 
CCTCAGGATC TAAGCCATTA TATTAAAGAT AAAACTCGCC AGACAATTCC CAGGATTAGA 960 
AATTATTACT TTTGTAGTTT GCCCCGAAAA TATAGGCAAA TAAACTCACA AGTTAGACGG 1020 
AGGOCGCATC TGGOTAGGCG GTATGCTATT AATCTGGATO TCCAAAAGGA GGCAATTAAT 1080 
CTCTTAAAOA ATTATCAAAC GTTGAATGAA GCCATAATGC ATCAGTATOC GAATTTTAAA 1140 
GAGGAGGCAC AGTGGGTAAO AAAATATTTT CGGGAAGAAC AAAAGAGAAT GAATCTTTCA 1200 
CCAGCTAAGC AATTCAACAT ATATAAAAAG GACTTCGGAA AAATGACTGC AAATTCTGTT 1260 
TCAGTTOCAA OCTOOGAACA GCTTACATAT TATAGCAAGT CAGTTOGGTT CATG CAATG G 1320 
GACAATAATG GAAACACAGG TAATGCTACT TGCTTTGTCT TCAATGGTGG TTATATTTTC 1380 
ACCTGTCGAC ATGTTGTACA TCTTATGGTG GGTAAAAACA CACATCCAAG TTTGTGGCCA 1440 
GATATAATTA GCAAATGTGC GAAGGTAACC TTCACTTATA CAGAGl 1CTG CCCTACTCCT 1500 
OACAATTGGT TTTCCATTGA GCCATOGCTT AAAGTGTOCA ATGAAAATCT AOATTATGCC 1560 
ATTTTAAAAC TAAAAGAAAA TGGAAATGCG TTTCCTCCAG GACTATGGOG ACAGATTTCT 1620 
CCICAACCAT CTACTGGTTT GATTTATTTA ATTGGTCATC CTGAAGGCCA GATCAAGAAA 1680 
ATAGATGGTT GTACTCTGAT TCCTCTAAAC GAACGATTGA AAAAATATCC AAACOATTGT 1740 
CAAG ATGGGT TGOTAGATCT CTATOATACC ACCAGT AATO TATACTGTAT OTTTACCCAA 1800 
AGAAGTTTCC TATCAGAGGT TTGGAACACA CACACGCITA GTTATG ATAC TTCTTTCTCT 1860 
GATGGGTCCT CAGGCTCCCC AGTGTTTAAT GCATCTGGCA AATTGGTTGC TTTGCATACC 1920 
TTTGGG CT1 1 1 1 1 A TCAACG AGGATTTAAT GTGCATGCCC TTATTGAATT TGGTTATTCT 1980 
ATGOATTCTA TTCTTTGTGA TATTAAAAAG ACAAATGAGA GCTTGTATAA ATCATTAAAT 2040 
GATGAGAAAC TTGAGACCTA OGATGAAGAG AAAGCCCGGC CCAGGCCAGC CTACCGGOGA 2100 
CTAGGATGCT TTOGCTTTCG CICTOGCTTT CCAATACTCG GGACTGGGGA AACCGGGAGA 2160 
ATAGAAGCAG GCAAGGACCG CGGTGGGCAC GGGGTCAGTG AGACAGGGTC CIGCTCGCGG 2220 
CGTCAAGGAG GAGCGCTGTG GGTGTCCCCA GCGCAGOCAA TCGGCTTCCG AAGTAGCTGG 2280 
AGCTCTGGAG OCTTTGCTTC CTCAAATACG AGCGGGAACT GCGTTGAGCG CTGGATTOCA 2340 
GGCCGAGTGC TGGCGAGGCG CGCAGTCTCT AAAGAGCAAC AGAATAATTG CAGTACTTCI 2400 
CTAATGAGGA TGGAGTCTAG AGGAGACCCA AGAGCCACAA CTAATACCCA GGCTCAAAGA 2460 
TTCCATTCAC CTAAG AAAAA TCCAGAAGAC CAGACCATGC COCAAAATAG GACAATATAT 2520 
GTTACCTTGA AGGCTGTCAG AAAAGAGATA GAAACTCACC AAGGOCAAGA A ATGCTT GTG 2580 
CGTGGCACAG AAGGAATCAA AGAGTACATA AACCTTGGAA TGCCCCTCAG TTGTTTCCCT 2640 
G AAGGTGGCC AGGTGGTCAT TACATTTTCC CAAAGTAAAA GTAAG CAGA A GGAAGATAAC 2700 
CACATATTTG GCAGGCAGGA CAAAGCATCG ACTGAATGTQ TCAAATTTTA CATTCATGCA 2760 
ATTGGAATTG GGAAGTGTAA AAGAAGGATT GTTAAATGTG GG AAGCTTCA CAAAAAGGGG 2820 
CGCAAACTCT GTCTTTATGC TTTCAAAGGA GAAACCATCA AGGATGCACT GTGCAAGGAT 2880 
GGCAGATTTC TTTCCTTTCT GGAGAATGAT GATIGGAAAC TCATTGAAAA CAATGACACC 2940 
ATTTTAGAAA GCACCCAGCC AGTTGATGAA TTAGAAGGCA GATACTTTCA GGTTGAGGTT 3000 
GAGAAAAGAA TGGTCCCCAG TGCAGCAGCT TCTCAGAATC CTG AGTCAGA GAAAAGAAAC 3060 
ACCTGTGTGT TGAG AGAACA AATCGTGGCT CAGTACCCCA GTTTGAAAAG AGAAAGTGAA 3120 
AAAATCATTG AAAACTTCAA GAAAAAAATG AAAGTAAAAA ATGGGGAAAC ATTATTTGAA 3180 
TTGCATAGAA CAACGTTTGG GAAAGTAACA AAAAATTCTT CTTCGATTAA AGTAGTGAAA 3240 
CTTCTTOTAC GTCTCAGTGA CICAGTTGGG TACTTATTCT GGGACAGTGC AACTAOGGGT 3300 
TAOGCCACCT G CTTKiTm TAAAGGATTG TTCATTTTAA CTTGTCGGCA TGTAATAGAT 3360 
AGCATTGTGG GAGACGGAAT AGAGCCAAGT AAGTGGGCAA CCATAATTGG TCAA TGTGTA 3420 
AGGGTQACAT TTGGTTATOA AGAGCTAAAA GACAAGGAAA CAAACTACTT TTTTGTTGAA 3480 
CCTTGGTTTC AGATACATAA TGAAGAGCTT GACTATGCTG TCCTGAAACT GAAGGA AAAT 3540 
GGACAACAAG TACCTATGGA ACTATATAAT GGAATTACTC CTGTGCCACT TAGTGGGTTG 3600 
ATACATATTA TICGOCATCC ATATGGAG AA AAAAAGCAG A TTG ATGCTTG TGCTGTG ATC 3660 
CCTCAGGGTC AGCGAGCAAA GAAATGTCAG GAACGTGTTC AGTCT AAAAA AGCAGAAAGT 3720 
CCAGAGTATG TCCATATGTA TACTCAAAGA AGTTTCCAGA AAATAGTTCA CAA CCCTGAT 3780 
GTGATTACCT ATGACACTGA ATTTTTCTTT GGGGCTTCCG GCTCCCCTGT GTTTGATTCA 3840 
AAAGGTTCAT TGGTGGCCAT GCATGCTGCT GGCTTTGCTT ATACTTACCA AAATG AG ACT 3900 
CGTAGTATCA TIGAGTTTGG CTCTACCATG GAATCCATCC TCCTTGATAT TAAGCAAAGA 3960 
CATAAACCAT GGTATGAAGA AGTATTTGTA AATCAGCAGG ATGTAGAAAT GATGAGTGAT 4020 
GAGG ACTTGXGA,G AATTCAG TCTACTGG AT TTAAGGGAAT GGCTTATGG A GTTGTTATTr 4080 
CGTAGGCATT GAAAATGGTT TTCTAAACTC CAAAATGGTC ATCTTATCAA TAATAATAAT 4140 
ATTGACCATT TCCTATCTGC CAOGCATTTT TCTAAGCACA TCAAG AAATT AGTCCTAACA 4200 
ACACTATGAG ATGGACTATA ACTTGCCCAA All 1 1111 II TTTTTGAGAC TGAGTCTCAC 4260 
TCTGTCGCCT GGGCTGGAGT ACAGTGGTGC GATCTCAGCT CACTGCAACT TCCACCTCCC 4320 
AGGTTCAAGC GATTCTTATO CCTCAGTCTC CTGAGCAGCT GGGATTACAO GCAAACGCCA 4380 
CCACACOCAG CTAAATTTTT TTTTTTTTn TGTATTTTTA GTAGAGACAG GGTTTCACCA 4440 
TGTTGGTCAG GCGGGTCTCG AACTCCTGAC CTCGTGATCC ACCTGCCTCG GCCTTCCAAA 4500 
GTGCTGGGAT TACAAGTTTG AGCCACTGCA CCTGGCTAAC TTGCCCTATT TTAAAGTCAA 4560 
GCAATGGGAA GAATAACAAG ATTATATAGT AATCAGTTTC ATGACACTAA AAGTCATATA 4620 
GTCATAGGGT TTTTTCATCT TTCATATCTT TGCCTAAATT CATTTGCTAC AGTGCAGG AA 4680 
CCAAAACTTG TTCATCTCAT GATTCCCTAC ATCTGACATA AGGAAAGTAA GTGCTCAGAA 4740 
AAATGTGCAG GTCAATAAGT TGCAAAAGTT GGGGCTGCAA TTAATGCTAA CATAAGAGCT 4800 
AAATGCTTGA TTAGAAATGA TCTCAAAACC TTTTAGAATT TCCAAAATCT TCATATTACT 4860 
GAAACTGTCG GAATATATGG GTCCTGAAAT TCAGAAGATG ATAGTC ACTC TTCCCATATT 4920 
TATAGGCTAT TAAGGCAAGG G ATATCTTAA ACATCATATT ACTTTATTTA GATTTCTACT 4980 
ACTCCAATTA TTAATGTTAT GTATTTCTCA TTGTTTTACT TCTTCATGGT ATTATGAAGA 5040 
CTATATAGAT GATTCAACCA AGCCTGCAAA TCTCCCTCTT GTGGAATTCC ACTGGACCCA 5100 
ATCTGTTTTC CATTTOCATT GCAATACTAC TAAAGCCATA CAATATCAAG CACCCTCCCT 5160 
CTAGGTOCAG GGACTATCAC AGAAGAAGCA GGCATGTAAO ATTTTAAGGA CTOGTTTCGA 5220 
CGGOTCOAGT OTAGOAAAAC AOCCTOTTOC ATTOTAAGAO TCATGTCACCfTO AAGAGC A 5280 
GCTGGCATGA TGACTGCTGT TTGACTCCTG CATAOCAAGA TATTCTGCAG CAATGTCTTT 5340 
AAACAGTGOC GOTAOTACAG ATAACCCCTC ATAAAGATGC TT ATCTA ACC TCCCCAOTOT 5400 
TCAGGTGTTT CACAAG AAAG TCTGAGATAT GACTAGCTAC ACGTTTTGCC AAAAATGCTT 5460 
GTTATATAAA GGGTACTTTT GGGAGGGTOA GTGOCGOCAT TTAGTGGCTG CTAG AAACAT 5520 
TG CTTCTOTT TGTAAGTTCC TATTAAATGT TCTTTCTGAG AAAAAAAAAA A 



SEQ ID NO:259 PBM4 Protein sequence: 
PBM4 Protein sequence: BAB6778B 

MDTVMKQTHA DTPVDHCLSG RKCSSTFKL KSEVNKHBTA LEMQNPNLNN KECOTF1LN 60 
GNSRKLDRSV FTAYGKPSES IYSALSANDY FSERKNQFN KNIIVYEEKT IDGH1NLGMP 120 
UCCLPSDSHF KITFGQRKSS KEDGULRQC ENPNMECILF HWAIGRTRK KIVKINELHB 180 
KGSKLCIYAL KGETIEGAIjC KDGRFRSDIG EFEWKLKEGH KHYGKQSMV DEVSGKVLEM 240 
DISKKKALQQ KDHKKIKQN ESATDEINHQ SUQSKKKVH KPKKDGETKD VEHSREQILP 300 
PQDLSHYIKD KTRQUPRIR NYYFCSLPRK YRQINSQVRR RPHLGRRYAI NLDVQKEAIN 360 
LLKNYQTLNE AIMHQYFNFK EEAQWVRKYF REEQKRMNLS PAKQFNIYKK DFGKMTANSV 420 
SVATCEQLTY YSKSVGFMQW DNNGNTGNAT CFVFNGGYIF TCRHWHLMV GKNTHPSLWP 480 
DI1SKCAKVT FTYTEFCPTP DNWFSEPWL KVSNENtPYA ILKLKENGNA FPPGLW RQg 540 
PQPSTGLIYL IGHPEGQDCK IDGCTVIPtM ERLKKYPNDC QDGLVDLYDT TSNVYCMFTQ 600 
RSFLSEVWNT HTLSYDTCFS DGSSGSPVFN ASGKLVALHT FGLFYQRGFN VHAIiEFGYS 660 
MDSIIjCDIKK TNESLYKSLN DEKLETYDEE KARPRPAYRR LGCFRFRSRF PUjG TGETGR 720 
EAGKDRRGH GVSETGSCSR RQGGALWVSP AQPIGFRSSW SSGAFASSNT SGNCVERWIP 780 
GRVLARRAVS KEQQKNCSTS LMRMESRGDP RATTNTQAQR FHSPKKNPED QJMPQNRTIY 840 
VTLKAVRKEI ETHQGQEMLV RGTEGDCBY1 NLGMPLSCTP EGGQWITFS QSKSKQKEDN 900 
HIFGRQDKAS TECVKFYIHA IGIGKCKRPJ VKCGKLHKKG RKLCVYAFKG ETKDALCKD 960 
GRFLSFLEND DWKLffiNNDT ILESTQPVDE UEGRYFQVEV EKRMVPSAAA SQNPESEKRN 1020 
TCVLREQIVA QYPSLKRESE KIIENFKKKM KVKNGETLFB LHRTTFGKVT KNSSSDCWK 1080 
LLVRLSDSVG YLFWDSATTG YATCFVFKGL FILTCRHVID STVGDGIEPS KWATUGQCV 1140 
RVTFGYEELK DKETNYFFVE PWFHHNEEL DYAVUXKEN GQQVPMELYN GITPVPLSGL 1200 
MUGHPYGE KKQD3ACAVI PQGQRAKKCQ ERVQSKKAES PEYVHMYTQR SFQKIVHNFD 1260 
VTIYDTEFFF GASGSPVFDS KGSLVAMHAA GFAYTYQNET RSOEFGSTM ESILLDDCQR 1320 
HKPWYEEVFV NQQDVEMMSD EDL 



SEQ ID NO:260 PBQ1 DNA sequence
Nucleic Acid Accession #: NM_015642

Coding sequence: 489-2489 (underlined sequence corresponds to start and stop codon)



1 11 21 31 41 51 

I I I I I I 

ACATTTCAAA AAAAATACAT AGRCTGATGT TTCAGACTTG TGCAGCATAA GCCTACAGGG 60 

TACGAAGAAT GAACTCTGAQ AATGTTTGOA GAATGTTTCA TCATTACTAA CAGGATATTC 120 

CTCATGACAT TGCTGTCTGA TCTTT6ACCA TCAGTCTGTG ACCTGCCCCT TCTCTTTACA 180 

TOCW3CCGCT CTCTGCTCCC TGOOCCAATG RACATCTGCA CTAGGCCCAA GCCTTGGAGT 240 

AATTTACCTG AAGAGTGACA CCATTGATTT TGAAACTACT GAA6AAACCC AAGACAGCTG 300 

AAAACCAGAA GGCATCTGAG GAGAATGAGA TTACTCAGCC GGGTGGATCC AGCGCCAAGC 360 

CGGGCCTTCC CTGCCTOAAC TTTGAAGCTG TTTTGTCTCC AGACCCAGCC CTCATCCACT 420 

CAACACATTC ACTGACAAAC TCTCACGCTC ACACCGGGTC ATCTGATTGT GACATCAGTT 480 

CCAAGGG GAT G ACCGAGCGC ATTCACAGCA TCAACCTTCA CAACTTCAGC AATTCCGTGC 540 

TCGAGACCCT CAACGASCAG CGCAACCGTG GCCACTTCT6 T6AC6TAACQ GTGCGCATCC 600 

ACGGGAGCAT GCTGCGCGCA CACCGCTGCG TGCTGGCAGC CG6CAGCCCC TTCTTCCAGG 660 

ACAAACTGCT GCTTGGCTAC AGCGACATCG AGATCCCGTC GGTGGTGTCA GTGCAGTCAG 720 

TGCAAAAGCT CATTGACTTC ATGTACAGCG GCGTGCTACG GGTCTCGCAG TCGGAAGCTC 780 

TGCAGATCCT CACGGCCGCC AGCATCCTGC AGATCAAAAC AGTCATCGAC GAGTGCACGC 840 

GCATCGTGTC ACAGAACGTG GGCGATGTGT TCCCGGGGAT CCAGGACTCG GGCCAGGACA 900 

CGCCGCGGGG CACTCCCGAG TCAGGCACGT CAGGCCAGAG CAGCGACACG GAGTCGGGCT 960 

ACCTGCAGAG CCACCCACAO CACAGCGTGG ACAGGATCTA CTCGGCACTC TACGCGTGCT 1020 

CCATGCAGAA TGGCAGCGGC GAGCGCTCTT TTTACAGCGG GGCAGTGGTC AGCCACCACG 1080 

AGACTGCGCT CGGCCTGCCC CGCGACCACC ACATGGAAGA CCCCAGCTGG ATCACACGCA 1140 

TCCATGAGCG CTCGCAGCAG ATGGAGCGCT ACCTGTOCAC CACCCCCGAG ACCACGCACT 1200 

GCCGCAAGCA GCCCCGGCCT GTGCGCATCC AGACCCTAGT GGGCAACATC CACATCAAGC 1260 

AGGAGATGGA GGACGATTAC GACTACTACG GGCAGCAA&G GGTGCAGATC CTGGAACGCA 1320 

ACGAATCCGA GGAGTGCACG GAAGACACAG ACCAGGCCGA GGGCACCGAG AGTGAGCCCA 1380 

AAGGTGAAAG CTTCGACTCG GGCGTCAGCT CCTCCATAGG CACCGAGCCT GACTCGGTGG 1440 

AGCAGCAGTT TGGGCCTGGG GCGGCGCGGG ACAGCCAGGC TGAACCCACC CAACCCGAGC 1500 

AGGCTGCAGA AGCCCCCGCT GAGGGTGGTC CGCAGACAAA CCAGCTAGAA ACAGGTGCTT 1560 

CCTCTCCGGA GAGAAGCAAT GAAGTGGAGA TGGACAGCAC TGTTATCACT GTCAGCAACA 1620 

GCTCCOACAA GAGCGTCCTA CAACAGCCTT CGGTCAACAC GTCCATCGGG CAGCCATTGC 1680 

CAAGTACCCA GCTCTACTTA CGCCAGACAG AAACCCTCAC CAGCAACCTG AGGATGCCTC 1740 

TGACCTTGAC CAGCAACACG CAGGTCATTG GCACAGCTGG CAACACCTAC CTGCCAGCCC 1800 

TCTTCACTAC CCAGCCCGCG GGCAGTGGCC CCAAGCCTTT CCTCTTCAGC CTGCCACAGC 1860 

CCCTGGCAGG CCAGCAGACC CAGTTTGTGA CAGTGTCCCA GCCCGGTCTG TCGACCTTTA 1920 

CTGCACAGCT GCCAGCGCCA CAGCCCCTGG CCTCATCCGC AGGCCACAGC ACAGCCAGTG 1980 

GGCAAGGCGA AAAAAAGCCT TATGAGTGCA CTCTCTSCAA CAAGACTTTC ACCGCCAAAC 2040 

AGAACTACGT CAAGCACATG TTCCTACACA CAGGTGAGAA GCCCCACCAA TGCAGCATCT 2100 

GTTGGCGCTC CT TCTCCTTA AAGGATTACC TTATCAAGCA CATGGTGACA CACACAGGAG 2160 
TOAGGGCATA CCAGTGTAGT ATCTGCAACA AGCGCTTCAC OCAGAAGAGC TCCCTCAACG 2220
TGCACATGCG CCTCCACCGQ CGAGAGAAGT CCTACGAGTO CTACATCTGC AAAAAGAAGT 2280 
TCTCTCACAA GACCCTQCTG GAGCGACACG TGGC0CT6CA CAGTGCCAGC AATGGGACCC 2340 
CCCCTGCAOO CACACCCCCA GGTGCCCGCG CTGGCCCCCC AGGCGTGGTG GCCTGCACCG 2400

AGGGGACCAC TTACGTCTGC TCCGTCTGCC CAGCAAAGTT TGACCAAATC GAGCAGTTCA 2460

ACGACCACAT GAGGATGCAT GTGTCTGACG GATAAGTAOT ATCTTTCTCT CTTTCTTATG 2520 
AACAAAACAA AACAACAACA AAAAACAAAC AAACAAAAAA GCTAXGGCAC TAGAATTTAA 2580 
GAAATGTTTT GGTTTCATTT TTACTTTCTG TTTTTGTTTT TGTTTCGTTT CATTTTGTAC 2640 
TACATGAAGA ACTGTTTTTT GCCTGCTGGT ACATTACATT TCCGGAGGCT TGGGTGAATA 2700

ATAGTTTTCC CAGTCTCCCT CGGATGGTGG CCTTAAGGCC TGGTAGTGCT TCAAGAGGTC 2760

CACTGGTTGG ATCTCTAGCT ACTGGCCTCT AAATACAACC CTTCITTACA AAAAAAAAAA 2820 
AAAAAAAAA 

SEQ ID NO:261 PBQ1 Protein sequence:

Protein Accession #: NP_056457

MTERIHSINL HNFSNSVLET LNEQRNRGHF CD VTVRIHGS MLRAHRCVLA AGSPFFQDKL 60 
LLGYSDEIP SWSVQSVQK UDFMYSGVL RVSQSEALQI LTAASHQIK TVEECTRIV 120 
5QNVGDVFPG 1QDSGQDTPR GTPESGTSGQ SSDTESOYLQ SHPQHSVDRI YSALYACSMQ 180
NGSGERSFYS GAWSHHETA LGLPRDHHME DPSWTTRIHE RSQQMERYLS TTPETTHCRK 240 
QPRPVRIQTL VONIMKQEM EDDYDYYGQQ RVQILERNES EECTEDTOQA EGTESEPKGE 300 
SFDSGVSSSI GTEPDSVEQQ FGPGAARDSQ AEPTQPEQAA EAPAEGGPQT NQUCTGASSP 360 
ERSNEVEMDS TVTTVSNSSD KSVLQQPSVN TSIGQPLPST QLYLRQTETL TSNL RMPLT L 420 
TSNTQVIGTA GNTYLPALFT TQPAGSGPKP FLFSLPQPLA GQQTQFVTVS QPQLSTFTAQ 480

LPAPQPLASS AGHSTASGQG EKKPYECTLC NKTFTAKQNY VKHMFVHK5B KPHQCSICWR 540
SFSLKDYUK HMVTHTGVRA YQCSICNKRF TQKSSLNVHM RLHRGEKSYB C YICKKKFSH 600 
KTLLERHVAL HSASNGTPPA GTPPGARAGP PG WACTEGT TYVCSVCPAX FDQIEQFNDH 660 
MRMHVSDG 






SEQ ID NO:262 PBQ6 DNA sequence
Nucleic Acid Accession #: AK54187

Coding sequence: 1-812 (underlined sequence corresponds to start and stop codon)



1 11 21 31 41 51

ATGO TCGAAO AGGAAACAGG CATATCTTAC ATGGTGGCAG ACAAGGGACA CCCTTCTACA 60 

AACTCTACCA CTTCTGCGCC GTCGTTTCGA CCATATAAAA ACGACCTATG CGAACTGCGT 120 

CGGAAAACTC CCTCAGGATG TAAAACGAAG ATCAGGAGCA GATTTGAAGA ATTACAAAGT 180

GAATTGGTGC CAGTCAGCAT GTCAGAGACA GACCACATAG CCTCTACTTC CTCTGATAAA 240 

AATGTTGGGA AAACACCTGA ATTAAAGGAA GACTCATGCA ACTTGTTTrC TG GCAA TGAA 300 

AGCAGCAAAT TAGAAAATGA GTCCAAACTA TTGTCATTAA ACACTGATAA AACTTTATOT 360 

CAACCTAATG AGCATAATAA TCGAATTGAA GCCCAGGAAA ATTATATTCC AGATCATGGT 420 

GGAGGTGAGG ATTCTTGIGC CAAAACAGAC ACAGGCTCAG AAAATTCTGA ACAAATAGCT 480

AATTTTCCTA GTGGAAATTT TGCTAAACAT ATTTCAAAAA CAAATGAAAC AGAACAGAAA 540 

GTAACACAAA TATTGGTGGA ATTAAGGTCA TCTACATTTC CAGAATCAGC TAATGAAAAG 600 

ACTTATTCAG AAAGCCCCTA TGATACAGAC TGCACCAAGA AATTTATTTC AAAAATAAAG 660 

AGCGTTTCAG CATCAGAGGA TTTGTTGGAA GAAATAGAAT CTGAGCTCTT ATCTACGGAG 720 

TTTGCAGAAC ATCGAGTACC AAATGGAATG AATAAGGGAG AACATGCATT AGTTCTGTTT 780

GAAAAGTCTG TGCAAGATAA ATATTTGCAG CAGGAACATA TCAXAAAAAA GGCCAGACTT 840 

GGTCTCTGTT ATTTGCCATC AAGAACCTCA ATTGACACGT TAATTCCGTT TATCCCAAAT 900 
TTATATAGAT AA 



SEQ ID NO:263 PBQ6 Protein sequence:
Protein Accession #: NP_060170



MEPKEATGKE NMVTKKKNLA FLRSRLYMLE RRKTDTWES SVSGDHSGTL RRSQSDRTEY 60 
NQKLQEKMTP QGECSVAETL TPEEEHHMKR MMAKREKIIK EUQTEKDYL NDLELCVREV 120
VQPLRNKKTO RLDVDSLFSN ESVHQISAK LLSLLEEATT DVEPAMQVIG EVFLQIKOPL 180 
EDIYKIYCYH HDEAHSBUES YEKEEELKEH LSHCIQSLK 

SEQ ID NO:264 PBY7 DNA sequence
Nucleic Acid Accession #: NMJM4323

Coding sequence: 662-2725 (underlined sequence corresponds to start and stop codon)
1 11 21 31 41 51

| | | | | |

GGGCCTACTC TGCCGCCGCC GCCGCCCGCC CGCTCCAGCC GCCGCCGCCG CCGCCACCGC 60 

CCTCCAGGCT CCGGGACCCG GCCCGCGCCA CCGCCCCCGT GCGCGCCCCG CCGCCGCCOC 120 

CTTCGCCTTC GCCTTTTGTT TCCTCCGCTC CGGCGCCCCC GCCCCGGCTC GCGCTTTGCA 180 

GGGGACGCAG CGCGCGCCCC CAGCGGCCCC GGGAAAAGCC GCGGCGCGCG CGCGCGCCTG 240 

CGCGGCGGAC CCCTCCTTCT CCTCCCCGCG TGCGCGTGCC CTTCTTGGCT GCGCGOCGGC 300

GCCGCCTGGC GGGCGGGAGG GGAGGTGGCA GGCGCGTTTG CAGGAGGGGC GCACCTCTTC 360 

GCTCGCGCAC CCCCCCGGAA GGTAGACCGG GAAGGGGAGG CGGGCGGGCG GAGAGGAGAG 420 

AGTCGCGCGC AGTCCAGCGA GGGCGGGGGT TGGCTATGTG GGGGGTGGTG CACCCCGCAG 480 

TCTAGACAGT CTGATCCGGG CTGGGGGCGT GTACACTCGG CGCACCTGCG AGACEACAGA 540 

GCCICGGGCC GGCACGTCTG GGGAGTGTGQ ACACGTCTGC TGCGOCCCGC TTCTCGCTGC 600



GTTTCAGATT 
GCCCATGGGG 
GGGACTGCTG 
AGA1TTTTAT 
TTCTCCCAAT 
ACTTGGTATG 
GTTTCTTTAA 
ATACCCAAAT 
TTCGTCATCC 
TACAATGCGG 
GAGTGTCCTC 
GAGCCCTGCT 
CCCACCCCAA 
TTCTCTAATT 
TGAAAGCTAT 
TATTAAACTT 
GGAAGAAATA 
TTTCAATGCT 
TCAGTTGTGT 
GACTGTATTA 



GGAGGGGGCG 
GTGAACGACG 
ACGGAGATGC 
CTCTTGOGGG 
GAOTACTTTG 
CCGGCTGATG 
CTGGAGATGC 
TCCCGCATCG 
CTOATGAGGT ' 
CTGGTACCCC 
GGCTTCCCTT 
GGCAGCATGC 
GCCTCTTTGC 
CCCCAACTGC 
AAGCGAGGCC 
GGGGGCCTGA 
GCCAACOGGC 
ATCGACCTTC 
GGCCCCCGAA 
TTCCGTGATG 
TCCTGCCCTG 
TCCCATGATG 
AGGCCTGATC 
TGTCAGACCT 
CATGAAGACA 
GACCACCTGA 
TTCTCCTC r e 
CAGGTCTCCA 
ACCTATGGCA 
TCCTATGGTG 
GGCTCTTTCT 
AAGTACCCAT 
ATCCAGAAGG 
GGCTCACCTT 
GTTCAGTCGG 
CCTGAAGGGA 
GGAAATGCTG 
TCATTTTTAA 
GGTCTTTAGA 
GGACAGGGGC 
TGGGAAGAAG 
CTATGATATT 
TCCCTTCCCA.1 
ATGCCCAACT 
CAAGACCCCC 
TGGAGGCGAG 
ATTTCAGTTC 
ATTATTATTA 
OCCAGGTGAT 
TGTTTAGATG 
GTTTTATGCA 
GTTGGGAACC 
CACATGTGAG 
AAAATGTTAG 



, GCGGCCGGGC 
! CCCGTCTGGC 
' GAACCAGCAG 
, GAGCTTCCCA 
1 CAGCGCCCAG 

GACGGCAGCA 

CTCCAAGGTA 
1 GGAGAGCTTT 
. GATCTGCCAG ■ 
: CGATATAATG 

CAACGGGGCA 
. GGAGGCAGCT 
: TGGGGTGOAC 
' ATTCCCCAGT 
' AAGGAAGGCC 
i CATCCTTCCA 
. CGAGGCCCAG 
I GCTGGGTGAG 

GACCAGGAAG 
' TAACCGOCAC 
1 GCGGTTC&AG 
! CAAGCCTTAC 
i ACATATCAAQ 
: TTTTGOCAOC 
I 0CA6GTGTGT 
! CGAGGGGCCC 
1 AAAGGTCCAT 
, OCCCATCCTG 
! CCAGAAATGC 
. TGCCAGCGAC 
' GQCAGTCCCC 
1 TGGGAGGTTC 
! GGCTCTCGGG 
, GCAGAACATG 
! ATCTTTAGTA 

CTGCTGTOTC 
, GGGAAGTSAT 
. ACCCCACTCC 

• CATCTGATAT 
' ACATAGGCCT 
: TGGTGCTCAA 
I AGTGATTTTG 
, AAGAACCACA 
I AAGCCAGAAG 

CCCTCTGCCT 

• GCTAGGACAA 
1 TTAACCATTC 
1 TTTTTAGGAC 
' TTGTAAACCG 
1 AACTTGGCTA 
. AAATGCCAGT 
! GACAGCCGGC 
! TTGACCTTGT 
! TA 



i GGGCGG06GC 
1 ACCAGGTGAG 
I GCGGGCGCTT 



SEQ ID NO:265 PBY7 Protein sequence:
Protein Accession #: NP_114439

MERVNDASCG PSGCYTYQVS RHSTEMLHNL NQQRKNGGRF CDVLLRVGDE SFPAHRAVLA 60 
ACSEYFESVF S AQLQDGQ AA DGGPADVGG A TAAPGGGAGG SRELEMHTTS S KVPGDILDF 120 
AYTSRIWRL ESFPELMTAA KFLLMRSVIE ICQEVUCQSN VQILVPPARA DMLFRFPGT 180 
SDLGFPLDMT NGAALAANSN GIAGSMQPEE EAARAAGAA1 AGQASLPVLP GVDRLPMVAO 240 
PLSPQIXTSP FPSVASSAPP LTGKRGRGRP RKANLLDSMF GSPGGLREAG ILPCOLCGKV 300 
FIDANRLRQH EAQHGVTSLQ LGYIDLPPPR LGENGLPBE DPDGPRKRSR TRKQVACEIC 360 
GKIFRDVYHL NRHKLSHSGE KPYSCPVCGL RFKRKDRMSY HVRSHDGSVG KPyiCQSCGK 420 
GFSRPDHLNG HIKQVHTSER PHKCQTCNAS FATRDRLRSH LACHEDKVPC QVCGKYLRAA 480 
YMADHLKKHS EGPSNFCSIC NREGQKCSHQ DPESSDSYG DLSDASDLKT PEKQSANGSF 540 
SCDMAVPKNK MESDGEKKYP CPECGSFFRS KSYLNKHIQK VHVRALGGPL GDLGPALGSP 600 
FSPQQNMSLL ESPGFQIVQS AFASSLVDPE VDQQPMGPEG K 



SEQ ID NO:266 PBY9 DNA sequence
Nucleic Acid Accession #: NM_012429

Coding sequence: 174-1385 (underlined sequence corresponds to start and stop codon)



1 11 21 31 41 51

| | | | | |

CCCTACTCCG CCTCTCGGGA TCCTTTAAGA GGCGGGGCTT GGCTGCCAGC TCCGCGGCCC 
GGGCAAAAGG CTGGGACTTT ACTCCGGGTG GCGGCGAGGA CGAGTCT6TG CTCCATCAGC 
K3CCGCACCC GCCGCCTCCC GCCCCCAAAC CCCATCCCCG OGGTTGAGCC ACGATGAGCG 180

GCAGAGTCGG CGATCTGAGC CCCAGGCAGA AGGACGCATT CJQGCAAQTTT CGGQ ACAATO 240 

TCCAGGATGT OCTGCOGGCC CTGCCGAATC CAGATGACTA TTTTCTCCTG CGTTGGCTOC 300 

GAGCCAGAAG CTTCGACCTG CAGAAGTOGG AGGCCATGCT CCGSAAGCAT GTGGAGTTCC 360 

GAAAGCAAAA GGACATTGAC AACATCATTA GCTGGCAGCC TCCAGAGGTO ATOCAACAGT 420 

ATCTGTCAGQ GGGTATGTGT GGCTATGAOC TGGATGGCTG CCCAGTCTGG TACGACATAA 480 

TTGGAOCTCT CGATGCCAAG GGTCTGCTGT TCTCAGOCTC CAAACAGOAC CTGCTGAGGA 540 

CCAAGATGCG GGAGTGTGAG CTGCTTGTGC AAGAGTGTGC CCACCAGACC ACAAAGTTGG 600 

GGAGGAAGGT GGAGACCATC ACCATAATTT ATGACTGCGA GGGGCTTGQC CTCAAGCATC 660 

TCTGGAAGCC TGCTGTGGAG GCCTATGGAG AGTTTCTCTG CAOGTtTGAG GAAAATTATC 720 

CCGAAACACT CAAGCGTCTT TTTGTTGTTA AAGCOCCCAA ACTGTTTCCT GTGGCCTATA 780 

ACCTCATCAA AGOCTTCCTG AGTGAOGACA CTCGTAAGAA GATCATGGTC CTGGGAGCAA 840

ATTGGAAGGA GGTTTTACTG AAACATATCA GCCCTGAOCA GGTGCCTGTG GAGTATGGGG 900 

GCACCATGAC TGACCCTGAT GGAAACCCCA AGTGCAAATC CAAGATCAAC TACGGGGGTG 960

ACASCCCCAG GAAGTATTAT GTGCGAGACC AGGTGAAACA GCAGTATGAA CACAGCGTGC 1020 

AGATTTCCCG TGGCTCCTCC CACCAAGTGG AGTATGAGAT CCTCnCCCT GGCTGTGTCC 1080 

TCAGGTGGCA GTTTATGTCA GATGGAGCGG ATGTTGGTTT TGGGATTTTC CTGAAGACCA 1140 

AGATGGGAGA GAGGCAGCGG GCAGGGGAGA TGACAGAGGT GCTO2CCAAC CAGAGGTACA 1200 

JCTCCCACCT GGTCCCTGAA GATCGGACCC TCACCTGCAG TGATOCTGGC ATCTATGTCC 1260 

TGCGGTTTGA CAACACCTAC AGCTTCATTC ATGCCAAGAA GGTCAATTTC ACTOTGGAGG 1320 

TCCTCCTTOC AGACAAAGCC TCAGAAGAGA AGATGAAACA GCTGGGGGCA GGCACCCCGA 1380 

AATAACACCT TCTCCTATAG CAGGCCTGGC CCCCTCAGTG TCTCCCTQTC AATTTCTACC 1440 

CCTIGTAGCA GTCATTTTCG CACAACCCTG' AAGCCCAAAG AAACTGGGCT GGAGGACAGA 1500 

CCTCAGGAGC TTTCATTTCA GTTAGGCAGA GGAAGAGCGA CTGCAGTGGG TCTCCGTGTC 1560 

TATCAAATAC CTAAGGAGTC CCCAGGAGCT GGCTGGCCAT CGTGATAGGA TCTGTCTGTC 1620 

CTGTAAACTG TGCCAACTTC ACCTGTCCAG GGACAGCGAA GCTGGGGGTG GCGGGGGGCA 1680 

TGTACCACAG GGTGGCAGCA GGGAAAAAAA TTMAAAAGG GTGAAAGATT GCC ACTTO AC 1740 

ACTTCAGGGA AGTCAGCTGC CGGGGAGAAA CTTGCTCCTA AATCAACACA TAAGTTTAGA 1800 

SCGCAATGAG GAGTAGCAGG GTAGCTGGTT GCTAGAGTTA GGGTGGGGAT CAGAAACTCT 1860 

TCCAAACATT TTAGCACTGA GGCTGGGGTA GCTTTTGGCT TTTCCCAGGT CTC AGGAGG T 1920 

GGCCTGAGTC AGCACACATC TTCCCACTCG GTAGACAGGC TCGCCTCTCC CTCACTTTGA 1980 

GACTTTGGCA ACTCCTGGGC CACACGGCCT GCCTCTTTGA TTACTAATGA TPGTCAGTGA 2040 

CTCAGAGCTT OCTGGGACTT CGGGTACCCA CCCGCTGTTC TCCATGCAAA CAAAGCGCCA 2100 

GGGAAATGAC CCACAGGGAT CGCAGCTGCA GGGAGGGCCA GGGAGGTTGG GGGTGGGAGT 2160 

GAATGCTAAA AGCAGATCGT CCAGTGCCCT TTTCAGTGCT ACCGGOCTCT CACCAAGCAG 2220 

TCCTCCATGT GAGCAACCCC GAGACAAAAA TGCIAAGTGG GATCAAGAGA GCAGCACTCG 2280 

GAGAGGGTGT TTGCCAGTCT GAGTGTCCOG CGGTGOOCGC CAACCCGCTT CCTGACTCAC 2340 

CTGAGCAAGG TCTTACTAAG CAGTCCCATC TCTGTGGGAG GCAXGCAACG CGTGCAGGGA 2400 

OHCAGGTGC CGGTCGGCGT AGCCAGGCCT GGAGGCCCCC CAGGCAGGAG GCCGCCCAAA 2460 

GGCGGGGOCG GCGTCTCGCA GACTAGGGGC TGGGGGOGGC CACAGACGGC CTCGAAACCA 2520 

CAGCCCTTAC CCCAATCCCA CGAGCCCCGC CAACG AACC A CAGGTGCTGG GCTTTAGAGA 2580 

ACATGGGAAG GCGGCCOCAG ACCTGGCGGG AACGCCTTTC CCTCAGAGCC AGGCCCCGGC 2640 

CCCGTCTGGG AAGCTCATCT TGCGAAGCTG AGGGAGCTCA GGGCAAAGGC CAGGCTAGCG 2700 

CGGACCGGAA GGGGCOGAGG CTGCACGGGC CTCTGCCAGA ACGCTCAGGA CATCCCGGCC 2760 
TGGGTTTACA ACGCTGTTAG GAAAATTAAC CAATGAATAA AGCAACGTTC AGTGCGCA 



SEQ ID NO:267 PBY9 Protein sequence: 
Protein Accession #: NP_036561

MSORVGDLSP RQKEALAKFR ENVQDVLPAL PNPDDYFLLR WLRARSFDLQ KSEAMLRKHV 60 
EFRKQKDIDN DSWQPPEVI QQYLSGGMCG YDLDGCPVWY DHGPLDAKG LLFSASKQDL 120 
LRTKMRECEL LLQECAHQTT KLGRKVETTT HYDCEGLGL KHLWKPAVEA YGEFLCMFEE 180 
NYPETLKRLF WKAPKLFPV AYNUKPFLS EDTRKKIMVL GANWKEVLLK HISFDQVPVE 240 
YGGTMTDPDG NPKCKSKINY GGDIPRKYYV RDQVKQQYEH SVQISRGSSH QVEYEILFPG 300 
CVLRWQFMSD GADVGPGIFL KTKMGERQRA CEMTEVLPNQ RYNSHLVPED GTLTCSDPGI 360 
YVLRFDNTYS FIHAKKVNFT VEVLLPDKAS EEKMKQU3AG TPK 



SEQ ID NO:268 PBH8 DNA sequence
Nucleic Acid Accession #: XM_009756

Coding sequence: 301-1440 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51 

I I I I I I 

GTGGGGACAG CCGAGCCGCG CCGGGCCCCT GGACGGCGTC GCCAAGGAGC TGGGATCGCA 60 

CTTGCTGCAG ACTTTGGATG GATTTGTTTT TGTGGTAGCA TCTGATGGCA AAATCATGTA 120 

TAIATCCGAG ACCGCTTCTG TCCATTTAGG CTTATOCCAG GTGGAGCTCA CGGGCAACAG 180 

TATTTATGAA TACATCCATC CTTCTGACCA CGATCAGATG ACCGCTGTCC TCACGGCCCA 240 

OCAGOCGCTG CACCACCACC TGCTCCAAGG TATGAGATAG AGAGGTCGTT CTTTCTTCGA 300 

ATGAAATGTG TCTTGGOGAA AAGGAACGCG GGCCTGACCT GCAGCGGATA CAAGGTCATC 360 

CACTGCAGTG GCTACTTGAA GATCAGGCAG TATATGCTGG ACATGTCCCT GTACGACTCC 420 

TGCTACCAGA TTGTGGGGCT GGTGGCCGTG GGCCAGTCGC TGCCACCCAG TGCCATCACC 480 

GAGATCAAGC TGTACAGTAA CATGTTCATG TTCAGGGCCA GCCTTGACCT GAAGCTGATA 540 
TTCCWSGATT CCAGGGTGAC CGAGGTGACG GGGTACGAGC CGCAGGACCT GATCGAGAAG 600

ACCCTATAOC ATCACGTGCA CGGCTGCGAC GTOTTCCACC TCCGCTACGC ACACCAOCTC 660 

CTGTTGGTGA AGGGCCAGGT CACCACCAAG TACTAOCGGC TGCTGTCCAA GCGGGGCGGC 720 

TGGGTGTGGG TGCAGAGCTA CGCCACCGTG GTGCACAACA GCCGCTCGTC CCGGCCCCAC 780 

TGCATCGTGA GTGTCAATTA TGTACTCACG GAGATTGAAT ACAAGGAACT TCAGCTGTCC 840 

CTGGAGCAGG TGTCCACTGC CAAGTCCCAG GACTCCTGGA GGACCGCCTT GTCTACCTCA 900 
CAAGAAACTA GGAAATTAGT GAAACCCAAA AATACCAAGA TGAAGACAAA GCTGAGAACA 960

AACCCTTACC CCCCACAGCA ATACAGCTCG TTCCAAATGG ACAAACTGGA ATGCGGCCAG 1020 

CTCGGAAACT GGAGAGCCAG TCCCCCTGCA AGCGCTGCTG CTCCTCCAGA ACTGCAGCCC 1080 

CACTCAGAAA GCAGTGACCT TCTGTACACQ CCATCCTACA GCCTGOCCTT CTCCTACCAT 1140 

TACGGACACT TCCCTCTGGA CTCTCACGTC TTCAGCAGCA AAAAGCCAAT GTTGCCGGCC 1200 

AAGTTCGGGC AGCCCCAAGG ATCCCCTTGT GAGGTGGCAC GCTTTTTCCT 6AGCACACT6 1260 

CCAGCCAGCG GTGAATGCCA GTGGCATTAT GCCAACCCCC TAGTGCCTAG CAGCTCGTCT 1320 

CCAGCTAAAA ATCCTCCAGA GCCACCGGCG AACACTGCTA GGCACAGCCT GGTGCCAAGC 1380 

TACGAAGGCA AGCAGATGTC CTCTGCGGAG ATACCGCCAG CTCCCCAGGA OGCAGACTGA 1440 
CTCCTGTTTG CTCGCTGGAC CAAC 



SEQ ID NO:269 PBH8 Protein sequence:
Protein Accession #: NP_005060

MKEKSKNAAK TRREKENGEF YELAKLLPLP SATTSQLDKA SHRLTTSYL KMRAVFPEGL 60 
GDAWGQPSRA GPLDGVAKEL GSHLLQTLDG FVFWASDGK IMVTSETASV HLGLSQVELT 120 
GNSIYEYMP SDHDEMTAVL TAHQPLHHHL LQEYTEERSF FLRMKCVLAK RNAGLTCSGY 180 
KVIHCSGYLK RQYMLDMSL YDSCYQIYGL VAVGOSLPPS AJTHKLYSN MFMFRASLDL 240 
KUFLDSRVT EVTGYEPQDL EKTLYHHVH GCDVFHLRYA HHLU.VKGQV TTKYYRIXSK 300 
RGGWVWVQS Y ATWHNSRSS RPHOVSVNY VLTEEYKEL QLSLEQVSTA KSQDSWRTAL 360 
STSQETRKLV KPKNTKMKTK UtTNFYPFQQ YSSFQMDKLE CGQLGNWRAS PPASAAAPPE 420 
LQPHSESSDL LYTPSYSLPF SYHYGHFPLD SHVFSSKKPM LPAKFGQPQG SPCEVARFFL 480 
STLPASGECQ WHYANPLVPS SSSPAKNPPE PPANTARHSL VPSYEAPAAA VRRPGEDTAP 540 
PSFPSCGHYR EEPALGPAKA ARQAARDG AR LALARAAFEC CAPPTPEAPG APAQLPFVLL 600 
NYHRVLARRG PLGGAAPAAS GLACAPGGPE AATGALRLKH PSPAATSPPG APLPHYLGAS 660 
VIITNGR 



SEQ ID NO:270 PBJ9 DNA sequence:
Nucleic Acid Accession #: AA760894

GGCACGAGGA GAAGATGTGG CTTGCTCATG CTTGACTTCT GCCATGGTTG TGAGGCCTCC 60 
CCAGCCATGT GGAACTGTIT TCAGGTGCTG GTTCCATGGC TCTTCCTGAG CCGAAAATAA 120 
GGAAACTCCA TAGACCTTGT CCACTGGAAC TCGTTCCCAT CTACCCTCCA CTCTATCCAG 180 
GGTG ATGGAT CTCTGCAGTA AGTGGAAGAG TTCTTCATGG CCCCCAAGGT TATATCCATC 240 
TAGAACTTCA GCACGTAATT TCATCTGG AA ATAGTGCCTT TGTGGATATA AGTTAGGTAA 300 
AACTGAAGAT GAGATCATAC TGGATTAGGA TGGGATCTAA ATCCAATGAA AATGTCTTCA 360 
TAAAAAACAG GAAAGAACCC ATAGAAACAC AAGGAAGAAG GTCATGTGAA GATGGAGGCA 420 
GAGATTGGAG GGATGCAGCC ACCGGCCCAG GAATGCCAGC AGCCACCCAG AAGCTGGAAG 480 
GAAATGAGGG ATTCTCTCCT AGAACCTTTA GAGAGRACAT GGTCCTGTGA ACAGCTTGAT 540 
TTTGG ACTTG CCCATAGCTT GTATACTCTT ACTTTGG ATA CAATTTT ATC CAAACTTGGC 600 
TAAACAGTTT CTCAGCCTAT GGAAAATTTA AAATGGAGAA GATTCAACTC G ATTCTTACA 660 
GATTCAAAGC AAGAAAATGA TGGGAACATA GGAGGAGACC AAGAAAGCCT ATAAAAAGCA 720 
AAAATATGAA GTCAACA1TG TGGTAGCTTT AAGATGTTTA GTGTAGCTGC AGGCACCCTA 780 
TACACATGAA AACCCCCAAG GGGAATCCGC ATATCACAGT GTAGTGTGAT ATTTGACATT 840 
YGTGATCATY TAGAGATGTA CAGAAAAGGT GAATCTGTGT TCTGTATATT C TGCCT AAGG 900 
CAAAGAAATQ TTTAGCTYTC TTTAAAATAG TTCCATAATT TTTTYTAAAA AGCTTTGCTT 960 
GAAAACTGTA AGCTTOCCAT ATCTGG AGCA TTTCACTTTA AATATTTGGA TAAATATGTT 1020 
ATCTTCTTAC TTGG ACATTT CATGTGTTTA GGGATTGTYT TYTAAATTCT TCCTAATTCA 1080 
TATAGCTGCT AACACTTCCC GCAGAGCTAA ACCATTACAG ANTATGAAAT AAAGACCCTA 1140 
TTGATTTGAA CTTAAAAAAA AAAAMAMAAA AAAAAAAAAA AAAAAAAAAT GA 

SEQ ID NO:271 PBQ4 DNA sequence
Nucleic Acid Accession #: AA149579

Coding sequence 1-13S3 (underlined sequence corresponds to start and stop codon) 



1 11 21 31 41 51

I I I I I I 

ATGGAATCAA TCTCTATGAT GGGAAGCCCT AAGAGCCTTA GTGAAACTTG TTTACCTAAT 60 

GGCATAAATO GTATCAAAGA TGCAAGGAAG GTCACTGTAG GTGTGATTGG AAGTGGAGAT 120 

TTTGCCAAAT CCTTGACCAT TCGACTTATT AGATGCGGCT ATCATGTGGT CATAGGAAGT 180 

AGAAATCCTA AGTTTGCTTC TGMTTTTTT CCTCATGTGG TAGATGTCAC TCATCATGAA 240 

GATGCTCTCA CAAAAACAAA TATAATATTT GTTGCTATAC ACAGAGAACA TTATACCTCC 300 

CTGTGGGACC TGAGACATCT GCTTGTGGGT AAAATCCTGA TTGATGTGAG CAATAACATG 360 

AGGATAAACC AGTACCCAGA ATCCAATGCT GAATATTTGG CTTCATTATT CCCAGATTCT 420 

TTGATTGTCA AAGGATTTAA TGTTQTCTCA GCTTGGGCAC TTCAGTTAGG ACCTAAGGAT 480 

GCCAGCCGGC AGGTTTATAT ATGCAGCAAC AATATTCAAG CGCGACAACA GGTTATTGAA 540 

CTTGCCCGCC AGTTGAATTT CATTOCCATT GACTTGGGAT CCTTATCATC AGCCAGAGAQ 600 

ATTGAAAATT TACCCCTACG ACTCTTTACT CTCTGGAGAG GGCCAGTGGT GGTAGCTATA 660 

AGCTTGGCCA CATTTTTTTT CCTTTATTCC TTTGTCAGAG ATGTGATTCA TCCATATGCT 720 

AGAAACCAAC AGAGTGACTT TTACAAAATT CCTATAGAGA TTGTGAATAA AACCTTACCT 780 

ATAGTTGCCA TTA CT TTGCT CTCCCTAGTA TACCTCGCAG GTCTTCTGGC AGCTGCTTAT 840 

CAACTTTATT ACGGCACCAA GTATAGGAGA TTTCCACCTT GGTTGGAAAC CTGGTTACAG 900 

TGTAGAAAAC AGCTTGGATT ACTAAGTTTT TTCTTCGCTA TGGTCCATGT TCCCTACAGC 960 

CTCTGCTTAC CGATGAGAAG GTCAGAGAGA TATTTGTTTC TCAACATGGC TTATCAGCAG 1020 

GTTCATGCAA ATATTGAAAA CTCTTGGAAT GAGGAAGAAG TTTGG AGAAT TGAAATGTAT 1080 

ATCTCCTTTG GCATAATGAG CCTTGGCTTA CTTTCCCTCC TGGCAGTCAC TTCTATCCCT 1140 

TCAGTGAGCA ATGCTTTAAA CTGGAGAGAA TTCAGTTTTA TTCWSTCTAC ACTTGGATAT 1200 

TCATAAGTAC TTTCCATGTT TTAATTTATG GATGGAAACO AGCTTTTGAG 1260 
GAAGAGTACT ACAGATTTTA TACACCACCA AACTTTGTTC TTGCTCTTGT TTTGCCCTCA 1320 
ATTGTAATTC TGGATCTTTT GCAGCTTTGC AGATACCCAG ACTGA 



SEQ ID NO:272 PBQ4 Protein sequence:
Protein Accession #: none



1 11 21 31 41 51

| | | | | |

HBSISKMGSP KSLSETCLPN OmSIKUAHK VTVGVXGSGD FAKSLTIRLI RCGYHWIGS 60 
RNPKPASEFP PHWDVTHHE DALTKTOIIP VAIHREHYTS LWDLRHLLVG KILIDVSKNM 120 
RINQYPESNA EYLASLFPDS LIVKGFNWS AHALQLGPKD ASRQVYICSN NIQARQQVXE 180 
LARQLNFIPI DLGSLSSARE IENLPLRLFT LWRGPVWAI SLATFFPLYS PVRDVIHPYA 240 
RNQQSDFYKI PIEIVNKTLP IVAITLLSLV YLAGLLAAAY QLYYGTKYRR FPPWLETWLQ 300 
CRKQLGLLSF FPAMVHVAYS LCLPMRHSER YLFLNMAYQQ VHANIEHSWN EEBVKRIEKY 360 
ISFGIMSLGL LSLIAVTSTP SVSNALNMRB FSFZQSTL6Y VALLISTFHV LIYGWKRAFE 420 
EEYYRFYTPP NFVLALVLPS IVTLDHjQLC RYPD 

SEQ ID NO:273 PBQ5 DNA SEQUENCE

Nucleic Acid Accession #: NM_001973

Coding sequence: (underlined sequence corresponds to start and stop codon)



1 11 21 31 41 51 

I I I I I I 

CCGCCGCCTT CTACTCCGCC GCGGGGGTCG CAGGGGCTGC CGCGCCGTCC TCGAGTTTCC 60 

AGCGTGAGGA GGAGGCTGAG GGCGGAGAGG CGCATCGTGT TCGAGGCGGA GACCGAGGGG 120 

GAGCCOCGCG CGCGGCGTCG CTCATTGCTA TGGACAGTGC TATCACCCTG TGGCAGTTCC 180 

TTCTTCAGCT CCTGCAGAAG cctcagaaca AGCACATGAT CTGTTGGACC TCTAATGATG 240 

GGCAGTTTAA GCTTTTGCAG GCAGAAGAGG TGGCTCGTCT CTGGGGGATT CGCAAGAACA 300 

AGCCTAACAT GAATTATGAC AAACTCAGCC GAGCCCTCAG ATACTATTAT GTAAAGAATA 360 

TCATCAAAAA AGTGAATGGT CAGAAGTTTG TGTACAAGTT TGTCTCTTAT CCAGAQATTT 420 

TGjAACATGGA TCCAAT6ACA GTGGGCAGGA TTGAGGGTGA CTGTGAAAGT TTAAACTTCA 480 

GTGAAGTCAG CAGCAGTTCC AAAGATGTGG AGAATGGAGG GAAAGATAAA CCACCTCAGC 540 

CTGGTGCCAA GACCTCTAGC CGCAATGACT ACATACACTC TGGCTTATAT TCTTCATTTA 600 

CTCTCAACTC TTTGAACTCC TCCAATGTAA AGCTTTTCAA ATTGATAAAG ACTGAGAATC 660 

CAGCCGAGAA ACTGGCAGA6 AAAAAATCTC CTCAGGAGCC CACAOCATCT GTCATCAAAT 720 

TTGTCACGAC ACCTTCCAAA AAGCCACCAS TTGAACCTGT TGCTGCCACC ATTTCAATTG 780 

GCCCAAGTAT TTCTCCATCT TCAGAAGAAA CTATCCAAGJC TTTGGAGACA TTGGTTTCCC 840

CAAAACTGCC TTCCCTGGAA GCCCCAACCT CTGCCTCTAA CGTAATGACT GCTTTTGCCA 900 

CCACACCACC CATTTCGTCC ATACCCCCTT TGCAGGAACC TCCCAGAACA CCTTCACCAC 960 

CACTGAGTTC TCACCCAGAC ATCGACACAG ACATTGATTC AGTGGCTTCT CAGCCAATGG 1020 

AACTTGCAGA GAATTTGTCT CTGGAGCCTA AAGACCAGGA TTCAGTCTTO CTAGAAAAGG 1080 

ACAAAGTAAA TAATTCATCA AGATCCAAGA AACCCAAAGG GTTAGGACTG GCACCCACCC 1140 

TTGTGATCAC GAGCAGTGAT CCAAGCCCAC TGSGAATACT GAGCCCATCT CTCCCTACAG 1200 

CTTCTCTTAC ACCAGCATTT TTTTCACAGA CACCCATCAT ACTGACTCCA AGCCCCTTGC 1260 

TCTCCAGTAT CCACTTCTGG AGTACTCTCA GTCCTGTTGC TCCCCTAAGT CCAGCCAGAC 1320 

TGCAAGGTGC TAACACACTT TTOCAGTTTC CTTCTGTACT GAACAGTCAT GGGCCATTCA 1380 

CTCTGTCTGG GCTGGATGGA CCTTCCACCC CTGGCCCATT TTCCCCAGAC CTACAGAAGA 1440 

CATAACCTAT GCACTTGTGG AATGAGAGAA GCGJAGGAACG AAGAAACAGA CATTCAACAT 1500 

GATTGCATTT GAAGTGAGCA ATTGATAGTT CTACAATGCT GATAATAGAC TATTGTGATT 1560 

TTTGCCATTC CCCATTGAAA ACATCTTTTT AGGATTCTCT TTQAATAGGA CTCAAGTTGG 1620 

ACTATATGTA TAAAAATGCC TTAATTGGAG TCTAAACTCC ACCTCCCTCT GTCTTTTCCT 1680 

' 1 TTCTTTTTC TTTCCTTCCT TCCTTTTCTT TTCTCCTTTA AAAATATTTT GAGCTTTGTG 1740 

CTGAAGAAGT TTTTGGTGGG CTTTAGTGAC TGTGCTTTGC AAAAGCAATT AAGAACAAAG 1800 

TTACTCCTTC TGGCTATTGG GACCCTTTGG CCAGGAAAAA TTATGCTTAG AATCTATTAT 1860 

TTAAAGAAGT ATTTGTGAAA TGAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1920 
AAAAAAAAAA AAA 



SEQ ID NO:274 PBQ5 Protein sequence:
Protein Accession #: NP_0019S4

MDSAITLWQF LLQLLQKPQN KHMICWTSND GQFKLLQAEE VAKLWGIRKN KPNMNYDKLS 60 
RALRYYYVKN IIKKVNGQKF VYKFVSYPEI LNMDPMTVGR IEGDCESLNF SEVSSSSKDV 120
ENGGKDKPPQ PQ AKXSSRND YWSGLYSSF TLNSLNSSNV KLFKLUCTEN PAEKLAEKKS 180 
PQEPTPS V1K FVTTPSKKPP VEPVAATISI GPSEPSSEE TIQALETLVS PKLPS LEAPT 240 
SASNVMTAFA TTPPISSIPP LQEPPRTPSP PLSSHPDIDT DIDS VASQPM ELPENLSLEP 300 
KDQDSVLLEK DKVNNSSRSK KPKGLGLAPT LVITSSDPSP LOILSPSLPT ASLTPAFFSQ 360 
TPIILTPSPL LSSIHFWSTL SPVAPLSPAR LQGANTLFQF PS VLNSHGPF TLSGLDGPST 420 
PGPFSPDLQKT 



SEQ ID NO:275 PBY3 DNA SEQUENCE

Nucleic Acid Accession #: AB040921

Coding sequence: 131-2560 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51 

I I I I I I 
AAICAGGAAC AGAICATATA TTGACCGAGA TTCTGAOTAT CTCTTGCAAG AAAATGRACC 60

AGATGGAACT TTAGACCAAA AATTATTGGA AGATTTACAA AAGAAAAAAA ATGACCTTCG 120 

GTATATTGAA ATGCAOCATT TCAGAGAAAA GCTGCCTTCG TATGQAATGC AAAAGGAATT 180 

GGTAAATTTA ATTGATAACC AICAGGTAAC AGTAATARGT GGTGAAACTG GTTGTGGCAA 240 

AACCACTCAA GTTACTCAGT TCATTTTGGA TAACTACATT GAAAGAGGAA AAOGATCTGC 300 

TTOCAQAATA GTTTGTACTC AGCCAAGAAG AATTAGTSCC ATTTCAQTTG CGGAAAGAGT 360 

AGCTGCAGAA AG6GCA6AAT CTTGTGGCAG TGGTAATAST ACTG6ATATC AAATTCGTCT 420 

CCAGAGTCGG TTGCCAAGQA AACAGGGTTC TATCTTATAC TGTACAACAQ GAATCATCCT 480 

TCAGTGGCTC CAGTCAGACC CGTATTTGTC CAGTGTTRGT CATATCGTAC TTGATGAAAT 540 

CCATGAAAGA AATCTGCAGT CAGATGTTTT AATGACTGTT GTTAAAGACC T TCTCAA TTT 600 

TCOATCTOAC TTGAAAQTAA TATTGATGAG TGCAACATTG AATGCAGAAA AQTTTTCAGA 660 

ATATTTTGQT AACTGTCCAA TGATACATAT ACCTGGTTTT ACCTTTOCGQ TTGTGGAATA 720 

TCTTTTGSAA GATGTAAITO AAAAAATAAG GTATGTTCCA GAACAAAAAQ AACACAGATC 780 

CCAGTTTAAO AGGGOTTTCA TGCAAGGGCA TGTAAATAGA CAAGAAAAAS AAGAAAAAGA 840 

AGCAATATAT AAAGAAOGTT GGCCAGATTA TGTAAGGGAA CTGCGAAGAA GGTATTCTGC 900 

AAGTACTGTA GATGTTATAG AAATGATGGA GQATGATAAA GTTGATCTGA ATTTGATTOT 960 

TGCCCTCATC CGATACATTG TTTTGGAAGA AGAGGATGGT GCGATACTGG TCTTTCTGCC 1020 

AGGCTCGGAC AATATCAGCA CTTTACATGA TCTCTTGATG TCACAAGTAA TGTTTAAATC 1080 

AGATAAATTT TTAATXATAC CTTTACATIC ACTGATGCCT ACAGTTAACC AGACACAGGT 1140 

GTTTAAAAGA ACCCCTCCTG GTGTTCGGAA AATAGTA&TT GCTACCAACA TTGCGGAOAC 1200 

TAGCATTACC ATAGATGATG 1CGTTTATOT GATAGATGGA GGAAAAATAA AAGAGAOGCA 1260 

TTTTGATACT CAGAACAATA TCAGTACAAT GTCCGCTGAG TGGGTTAGTA AAGCTAATGC 1320 

CAAACAGAGA AAAGGTCGAG CTGGAAGAGT TCAAOCTGGT CATTGCTATC ATCTGTATAA 1380 

TGGTCTTAGA GCAAGTCTTC TAGATGACTA TCAACTGCCA GAAATTTTGA G AACT CCTTT 1440 

GGARGAACTT TGTTTACAAA TAAAGATTTT AAGGCTAGGT GGAATTGCTT ATTTTCTGAG 1500 

TAGATTAATG GACCCACCAT CAAATGAGGC AGTGTTACTC TCCATAAGAC ACCTGATGGA 1560

GCTGAACGCT TIGGATAAAC AAGAAGAATT GACACCTCTT GGAGTCCACT TGGCACGATT 1620 

ACCCGTTGAG CCACATATTG GAAAAATGAT TCTTTTTGGA GCACTGTTCT GCTGCTTAGA 1680 

CCCAGTACTC ACTATTGCTG CTAGTCTCAG TTTCAAAGAT CCATTTGTCA TTCCACTGGG 1740 

AAAAGAAAAG ATTGCAGATG CAAGAAGAAA GGAATTGGCA AAGGATACTA GAAGTGATCA 1800 

CTTAACAGTT GTGAATGCGT TTGAGGGCTG GGAAGAGGCT AGGCGACGTG GTTTCAOATA 1860 

CGAAAAGGAC TATTGCTGGG AATATTTTCT GTCTTCAAAC ACACTGCASA TGCTGCATAA 1920 

CATGAAAGGA CAGTTTGCTG AGCATCTTCT TGGAGCTGGA TTTGTAAGCA GTAGAAATCC 1980 

TAAAGATCCA GAATCTAATA TAAATTCAGA TAATGAGAAG ATAATTAAAG CTGTCATCTG 2040 

TGCTCGTTTA TATCCCAAAG TTGCTAAAAT TCGACTARAT TTGGGTAAAA AAAGAAAAAT 2100 

GGTAAAAGTT TACACAAAAA CCGATQGCCT GGTTGCTGTT CATCCTAAAT CTGTTAATGT 2160 

GGAGCAAACA GACTTTCACT ACAACTGGCT TATCTATCAC CTAAAGATGA GAACAAGCAG 2220 

TATATACTTG TATGACTGCA CAGAGGTTTC CCCATACTGT CTCTTGTTTT TTGGAGGTGA 2280 

CATTTCCATC CAGAAGGATA ACGATCAGGA AACTATTGCT GTAGATGAGT GGATTGTATT 2340 

TCW3TCTCCA GCAAGAATTG CCCATCTTGT TAAGGAATTA AGAAAGGAAC TAGATATTCT 2400 

TCTGCAAGAG AAGATTGAAA GTCCTCATCC TGTAGACTGG AATGACACTA AATCCAGAGA 2460 

CTGTGCAGTA CTGTCAGCTA TTATAGACTT GATCAAAACA CAGGAAAAGG CAACTCCCAG 2520

GAACTTTCCG CCACGATTCC AGGATGGATA TTACAGCTGA CAGCTTTTCA GGGGTGGTCT 2580

GAAAAGCCAG TTTGACAGCC ATTCTTCATC ATTGTTTAAA TTTTGGCTGG ATGCC AAACC 2640 

CTGGGACATG AACAATTTTC ATGTGTAAGG TAGAAGCCTT CAGTAGGTAG TAAAGACTOA 2700 

ATGTGCAIGA CTTGATGTTA TATGTAGAGA TATATATATA TATATATATA CCATAAAAGC 2760 

AATATGTTCT CTGATCATAT ACTCTGCTGT GGTCATGCCC ACTCTTTGGG AGTATATTCC 2820 

CTTTATATAT ATTGAGTATT GTACCACTTG AGAAATTCCT TTGTTCTGTT AXACAAAATT 2880 

AATCTTTCTG CTCATAATGA TTGATGATAC CACCAGTAAA AATAGGATGT TTACCCCAAA 2940 

ACAAGTGTCA ATTAAGAATT TGAACACAAC CACATTTTTT AAAATGAAAC TTCTATCGGA 3000 
AGTAAATTAA TTTGTTGTAA TAAAGTCCAG TATTTAATAA AATGTACAAT GTTAAATCTC 

SEQ ID NO:276 PBY3 Protein sequence:
Protein Accession #: BAA96012

IRNKSYTDRD SEYLLQENEP DGTLDQKLLE DLQKKKNDLR YEMQHFREK LPSYGMQKEL 60 
VNIJDNHQVT VISOETGCGK TTQVTQFILD NYIERGKGS A OUVCTQPRR ISAISVAERV 120 
AAERAESCGS CNSTGYQIRL QSRLPRKQGS ILYCTTGnL QWLQSDPYLS SVS HTVLD E1 180 
KERNLQSDVL MTWKDLLNF RSDLKVILMS ATLNAEKFSE YK3NCPMIH1 PGFTTPWEY 240 
LLEDVIEKIR YVPEQKEHRS QFKRGFMQGH VNRQEKEEKE AIYKERWPDY VRELRRRYS A 300 
STVDVIEMME DDKYDLNUV ALIRYIVLEE EDGAILVFLP GWDNISTLHD LLMSQVMFKS 360 
DKFUIPLHS LMPTVNQTQV FKRTPPGVRK IVIATNIAET SITIDDWYV IDGGKUCETH 420 
FDTQNN1STM SAEWVSKANA KQRKGRAGRV QPGHCYHLYN GLRASLLDDY QLPEILRTPL 480 
EELCLQIKIL RLGQIAYFLS RLMDPPSNEA VLLSIRHLME LNALDKQEEL TPLGVHLARL 540 
PVEPHIGKMI LFGALFCCLD PVLTIAASLS FKDPFVIPIjG KEKIADARRK ELAKDTRSDH 600 
LTWNAFEGW EBARRRGFRY EKDYCWEYFL SSNTLQMLHN MKGQFAEHLL GAGFVSSRNP 660 
KDFESNINSD NEKIUCAVIC AGLYPKVAKI RLNLGKKRKM VKVYTKTDGL VAVHPKSVNV 720 
EQTOFHYNWL IYHLKMRTSS IYLYDCTEVS PYCLLFFGGD ISIQKDNDQE TIAVDEWIVF 780 
QSPARIAHLV KELRKELDIL LQEKESPHP VDWNDTKSRD CAVLSATEDL KTQEKATPR 840 
NFPPRFQDGYYS 

SEQ ID NO:277 PBY6 DNA SEQUENCE

Nucleic Acid Accession #: AA464018

Coding sequence: 64-1669 (underlined sequence corresponds to start and stop codon)



GATTTTATCC TGGAACATTA CAGTGAAGAT GGCTATTTAT ATGAAGATGA AATTGCAGAT 60 
fTTATGGATC TGAGACAAGC TTGTCGGACG CCTAGCCGGG ATGAGGCCGG GGTGGAACTG 120 
CIGATGACAT ACTTCATCCA GCTGGOCTTT GTCGAGAGTC GATTCTTCCC GCCCACACGG 180
CAGATGGG AC TCCTGTTCAC CTGGTATGAC TCTCTCACCQ GGGTTCCGGT CAGCCAGCAG 240 
AACCTGCTGC TOG AO AAGOC CAGTOTCCTO TICAACACTO GGGCCCTCTA CACC CAGAT T 300 
GGGACCCGGT GTGATCGGCA OACGCAGGCT GGGCTGG AGA GTGCCATAGA TGCCTTTCAG 360 
AGAGCCGCAQ GGGTTTTAAA TTACCTGAAA OACACATTTA CCCATACTCC AAGTTAOQAC 420 
ATGAGCCCTG OCATGCTCAG CGTGCTCGTC AAAATGATGC TTOCACAAGC CCAAGAAAGC 480 
GTOITTQ AG A AAATCAGGCT TCCTGGG ATC CGGAATG AAT TCTTCATGCT GGTGAAGGTG 540 
GCTCAGG AGO CTGCTAAGGT GGGAG AGGTC TACCAACAGC TACACXJCAGC CATOAGOCAQ 600 
GCGCCGGTGA AAG AGAACAT CCCCTACTCC TGGGCCAGCT TAGCCTGCGT GAAGGCCCAC 660 
CACTACGCGG CCCTGGCCCA CTACTTCACT GCCATCCTCC TCATCGAGCA CCAGGTGAAQ 720 
CCAGGCACGG ATCTGGACCA CCAGGAGAAG TGCCTGTCCC AGCTCTACGA CCACATGCCA 780 
GAGGGGCTGA CACCCTTGGC CACACTGAAG AATGATCAGC AGCOCCGACA GCTGGGGAAG 840 
TOGCACTTGC GCAGAGCCAT GGCTCATCAC GAGGAGTCGG TGCGGGAGGC CAGCCTCTGC 900 
AAGAAGCTGC GGAGCATTGA GGTGCTACAG AAGGTGCTGT GTGCCGCACA GO AACGCTCC 960 
CGGCTCACGT ACGCCCAGCA CCAGGAGGAG GATGACCTGC TGAACCTGAT CGAOGCOCCC 1020 
AGTGTTGTTG CTAAAACTX3A GCAAGAGGTT GACATTATAT TGCCCCAGTT CTCCAAGCTG 1080 
ACAGTCACGG ACTTCTTCCA GAAGCTCGGC CCCTTATCTG TGTTTTCGGC TAACAA GCGO 1140
TGGACGCCTC CTCGAAGCAT CCGCTTCACT GCAGAAG AAG GGGACTTGGG GTTCACCTTG 1200 
AGAGGGAACG OCOCCGTTCA GGTTCACTTC CTGGATCCTT ACTGCTCTGC CTOGGTGGCA 1260 
GGAGCCCGGG AAGGAGATTA TATTGTCTCC ATTCAGCTTQ TGGATTGTAA GTGGCTOACG 1320 
CTGAGTGAGG TTATGAAGCT GCTGAAGAGC TTTGGCGAGG ACQ AGATCGA GATGAAAGTC 1380 
GTGAGCCTCC TGGACTCCAC ATCATCCATG CATAATAAGA GTGOCACATA CTCCGTGGGA 1440 
ATCCAGAAAA CGTACTCCAT GATCT G CTTA GCCATTGATG ATGACGACAA AACT0ATAAA 1500 
ACCAAGAAAA TCTCCAAGAA GCTTTCCTIC CTGAGTTGGG GCACCAACAA GAACAGACAG 1560 
AAGTCAGCCA GCACCTTGTG CCTCCCATCG GTCGGGGCTG CACGGCCTCA GGTCAAGAAG 1620 
AAGCTGCCCT CCCCTTTCAG CCTTCTCAAC TCAGACAGTT CTTGGTACIA_A 



SEQ ID NO:278 PBY6 Protein sequence:
Protein Accession #: NP_149094

DHLEHYSED GYLYEDEIAD LMDLRQACRT PSRDEAGVEL LMTYHQLGF VESRFFPPTR 60 
QMGLLFTWYD SLTGVPVSQQ NLUJEKASVL FNTQALYTQI GTRCDRQTQA GLESAIDARJ 120 
RAAGVLNYLK DTFIHTPSYD MSPAMLS VLV KMMLAQAQES VFEHSLPGI RNEFFMLVKV 180 
AQEAAKVGEV YQQLHAAMSQ APVKENIPYS WASLACVKAH HYAALAHYFT AILLIDHQVK 240 
FGTDLDHQEK CtSQLYDHMP EGLTPLATLK NDQQRRQLGK SHLRKAMAHH EESVREASLC 300 
KKLRSEVLQ KVLCAAQERS RLTYAQHQEE DDLLNUDAP SWAKIEQEV DDLPQFSKL 360 
TVTDFFQKLG PLSVFS ANKR WTPPRSIRFT AEEGDLGFTL RGNAPVQVHF LDPYCS ASVA 420 
GAREGDYTVS IQLVDCKWLT LSEVMKLLKS FGEDEEMKV VSLLDSTSSM HNKSATYSVG 480 
MQKTYSMICL AIDDDDKTDK TKKISKKLSF LS WGTNKNRQ KS ASTLCLFS VGAARPQVKK 540 
KLPSPFSLLN SDSSWY 



SEQ ID NO:279 PBY8 DNA SEQUENCE

Nucleic Acid Accession #: AF107493

Coding sequence: 125-556 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51 

I I I I I I 

GAATTCGGCA CGAGCCTTGT TGGAGGTTCT GGGGCGCAGA ACCQCTACTG CTGCTTOGGT 60 

CTCTCCTTGQ GAAAAAATAA AATTTGAACC TTTTGGAGCT GTGTGCTAAA TCTTCAGTGG 120 

GACAATGGGT TCAGACAAAA GAGTGAGTAG AACAGAGCGT AGTGGAAGAT ACGGTTCCAT 180 

CATAGACAGG GATSACCGTG ATGAGCGTGA AfCCCGAAGC AGGCGGAGGG ACTCAGATTA 240 

CAAAAGATCT AGTGATGATC GGAGGGGTGA TAGATATGAT GACTACCGAG ACTATGACAG 300 

TCCAGAGAGA GAGCGTGAAA GAAGGAACAG TGAOCGATCC GAAGATGGCT ACCATTCASA 360 

TOGTGACTAT GGTGAGCACG ACTATAGGCA TGACATCAGT GACGAGAGGG AGAGCAAGAC 420 

CATCATGCTG CGCGGCCTTC CCATCACCAT CACAGAGAGC GATATTCGAG AAATGATGGA 480 

GTCCTTCGAA GGCCCTCAGC CTGCGGATGT GAGGCTGATG AAGAGGAAAA CAGGTGAGAG 540 

CTTGCTTAGT TCCTGATATT ATTOTTCTCT TCCCCATTCC CACCTCAGTC CCTAAAGAAC 600 

ATCGTGATTC CCCCAGTCTT CAAGCACATG AATTCAGAAT GAAAGGTTTG CCATGGCTA A 660 

GGAATGTGAC TCTTTGAAAA CCATGTTAGC ATCTGAGGAA CTTTTTTAAA CPTTGTm'A 720 

GGGACTTTTT TTTCCTTAGG TAAGTAATGA TTTATAAACT CCTTTTTTTT TTTGACTATA 780 

GTCGGTTGCA TGGTTACTTT AAGCGTGGAA TCAAATGGAG TGGCATTTAG TTCAGGCGGC 840 

TTGTTCCTTG CCATGGCAAA GTATCAAGAA GATCCCCAAG TCAAGTCACA TTTGTAAAGC 900 

TGCTTCCCAA TTGGCTTTGT CACGCAGTGT TGAAGCAGTG GGAGAGAGAT TCACCTGTTA 960 

TAAAGGAACT GACTAACACA AGTATCCCGT CTATATCTGA ATGCTGTCTC TAGGTGTAAG 1020 

CCGTGGTTTC GCCTTCGTOG AGTTTTATCA CTTGCAAGAT GCTACCAGCT GGATGGAAGC 1080 

CAATCAGGTT GCTTCACTCA CCAAGTCTAG AIATTCATGA AAATGGAACA AGTCTGTACA 1140 

ATTTTAAAAA AAGGTTGAAG GACTGG TT TG TTCCAAAGGA GTGACTTTTT TTTAAAAAAA 1200 

AAGCTTTGTA TATATTAAAA TTGATGTTAC TAGAATAAGT ACAGTACCAA GGACTTCATT 1260 

ATAGAATTTG TTCTGCCTTT AAACATGGCT ACCTACCTGG CAGGGCTTTG TTAACTACTG 1320 

AATACCTGTC TGGTAATCAC TAAAACATCT TTATGTTTCC CTTTTTTCTA GTTTGTTATA 1380 

TTCCTATTAT GTCCATTGAG AGTAAGCTTA GTATATCAAA CTCTCCATTT GACAGTGAAG 1440 

AGAACATAGT GAAAGTCTGT GGCGGCATTT TTATAAGTAA TTCCTTATTT CTGCCTGAAG 1500 

ACCACAAAGC CTCCTGGAGG CGTAACTGCT CAGACCGGTC TTCAOGGAAT ATTTAAGGAC 1560 

TTAGTGGAAT TTATGAACAA TAAGTCTGAT GAGATTAGCC TGGSMTGGT GTCCTGCAGC 1620 

TGTCTAATCT AGAGTGGCAT TAACATTCTA ATCTCCTTGA GAATGCCTTT TATA GTCT GT 1680 

TCAAAGCAAG TCATTGATGG TTCTTCGAGG TAGTGTTAAC TGAAGTGTTC TTCAGTTTGT 1740 

CAAGATAATG TTCAGTGCTT GGCACTTAAA TAACATTTTT TGCAAGAACT CCAAGGCACA 1800 
TTATTOAATG CCTTTAACCA AGTGCATTCT GGGAAGTTTG CTTGACTCAT TATCTTGCTT 1860 

TTCTGCAGCA TTCTGTGATT TGAGTCATCC ATGAATCCAT GAATAAAAQT TACATTCTTT 1920 

GATTGGTAAT ATTGCCATTT ATAACAAGAC TCACTAATGA GGGTATCACT TTGACTGACT 1980 

GATTTOTrAA AGTTTTTAAG CCTCTCATTT TCCTAACCCA GAAATCACAG CCTGATTTTA 2010 

TTAAAAGTAQ AGCTTCATTC ATTTCATACC ATAGATACCA T OCTAOTA AA TCCAGAACAT 2100 

ATACAAGGTT CATGTGAOTC TGCTTTCTTO ACATGATAGC ATTGTTTGAT GCAGTGGATA 2160 

TGTCAGAATG ACTAACCTAG GASTTTGAAA CTCCTAAGAA ACTAAAACCT GTAAGACATT 2220 

TAAAAGTCTC CACAATTTTA ATGTAXACAA AfiCTATGTTA CTGTGTAACA CATTACAGTT 2280 

CAAATTCACT CCA6AAATAA AAGGCCAGTA GQATTAGGGA CTCACTGGTA GTTTGGAGTC 2340 

TOCCAGCACA CATCCCTCCT AGTGGGATGA TCTATTCACA TATCTCCCAG CTTTTTTATT 2400 

TTTGCTTCTG TATATCACAG TGAOT6QAT6 OCCCTTCAGC TTTTTCTCTC CTGGCCA6AC 2460 

ATGCAGTCTT GOCTTTAGAT ATCGCAGAGA CAAAATTCAC AGCATGTCTT AAATCTTCCA 2520 

GGATTTGCAA GAAOCAAATT GCTCAACAGT ATGTATGTTT AGAGGGGTTA GACTCCTTTT 2580 

TAAAATCTGG ATATCTAACC ACCTACTTAA ATCTGTTTGA TAGTGTCAAA OCACCCCCAC 2640 
CCTTGATCCT CCCAOCCCCA AAAAAAAAAA AAAA 



SEQ ID NO:280 PBY8 Protein sequence: 
Protein Accession #: XP_003261 

MGSDKRVSRT ERSGRYGSn DRDDRDERES RSRRRDSDYK RSSDD RRGD R YDDYRDYDSP 60 

ERERERRNSD RSEDGYHSDG DYGEHDYRHD ISDERESKTI MLRQUTTIT ESDIREMMES 120 
FECPQPADVR LMKRKTGESL LSS 
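
The "Coding sequence" annotations in these records (for example 125-556 for SEQ ID NO:279) read as 1-based, inclusive positions whose first and last three bases are the start and stop codons. A minimal sketch of how such a span could be checked is given below; the function names and the toy sequence are hypothetical and are not taken from the listing. Because the sketch drops every non-letter character, a listing row in which a base has been misread as a digit would shift all later coordinates.

```python
# Hypothetical helpers for checking a "Coding sequence: start-end" annotation.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def clean_sequence(listing_text):
    """Keep only letters, dropping position numbers and spacing."""
    return "".join(ch for ch in listing_text.upper() if ch.isalpha())


def coding_region(sequence, start, end):
    """Return the 1-based, inclusive span start..end of the sequence."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("coordinates fall outside the sequence")
    return sequence[start - 1:end]


def check_orf(cds):
    """Report whether the span looks like a complete open reading frame."""
    return {
        "length_multiple_of_3": len(cds) % 3 == 0,
        "starts_with_ATG": cds.startswith("ATG"),
        "ends_with_stop": cds[-3:] in STOP_CODONS,
    }


# Toy row in the listing's layout (blocks of ten bases, then a position number).
toy_listing = "GGCCATGGCT TAAGG 15"
toy_sequence = clean_sequence(toy_listing)
toy_cds = coding_region(toy_sequence, 5, 13)
toy_report = check_orf(toy_cds)
```

With an annotation of 5-13 on the toy row, the extracted span is `ATGGCTTAA`, which passes all three checks.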

SEQ ID NO:281 P02 DNA SEQUENCE 

Nucleic Acid Accession #: AF208291 

Coding sequence: 109-3705 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

CGGCCGCTTT TTTCTCAAGA TGGCAGATTC CCACTGAGOC TGAGGGGGCC GA6CTCGGGC 60 

GCCGCGTTCC CTTCTCCGTT GCCATGAACC GCGGACACCC CGGCCCC GAT GG CCCCCGTG 120 

TACOAAGGTA TGGCCTCACA TGTGCAAGTT TTCTCCCCTC ACACCCTTCA AICAASTOCC 180 

TTCTGTAGTG TGAAGAAACT AAAAGTAGAG CCAAGTTCCA ACTGGGACAT GACTGGGTAC 240 

GGCTCCCACA GCAAAGTGTA CAGCCAGAGC AAGAACATAC CACCTTCTCA GCCAGCCTCC 300 

ACAACCGTCA GCACCTCCTT GCCGGTCCCA AACCCAAGCC TACCTTACGA GCAGACCATC 360 

GTCTTCCCAG GAAGCACCGG GCACATCGTG GTCACCTCAfi CAABCAGCAC TTCTGTCACC 420 

GGGCAAGTCC TCGGOGGACC ACACAACCTA ATGCGTCGAA GCACTGTGAG CCTCCTTGAT 480. 

ACCTACCAAA AATGTGGACT CAAGCGTAAG AGCGAGGAGA TCGAGAACAC AAGCAGCGTG 540 

CAGATCATCG AGGAGCATCC ACCCATGATT CAGAATAATG CAAGCGGGGC CACTGTCGCC 600 

ACTGCCACCA CGTCTACTGC CACCTCCAAA AACAGCGGCT CCAACAGCGA GGGCGACTAT 660 

CAGCTGGTGC AGCATGAGGT GCTGTGCTCC ATGACCAACA OCTACGAGGT CTTAGAGTTC 720 

TTGGGCCGAG GGACGTPTGG ACAAGTGGTC AAGTGCTGGA AACGGGGCAC CAATGAGATC 780 

GTAGCCATCA AGATCCTGAA GAACCGCCCA TCCTATGCCC GACAAGGTCA GATTGAAG7G 840 

AGCATOCTGG CCCGGTTGAG CACGGAGAGT GCCGAIGACT ATAACTTCGT CCGGGCCTAC 900 

GAATGCTTCC AGCACAAGAA CCACACGTGC TTGGTCTTCG AGATGTTGGA GCAGAACCTC 960 

TATGACTTTC TGAAGCAAAA CAAGTTTAGC CCCTTGCCCC TCAAATACAT TO3CCCAGTT 1020 

CTCCAGCAGG TAGOCACAGC CCTGATGAAA CTCAAAAGCC TAGGTCTTAT CCACGCTGAC 1080 

CTCAAACCAQ AAAACATCAT GCTCGTGGAT CCATCTAGAC AACCATACAG AGTCAAGGTC 1140 

ATCGACTTTG GTTCAGCCAG CCACGTCTCC AAGGCTGTGT GCTCCACCTA CTTGCAGTCC 1200 

AGATATTACA GGGCCCCTGA GATCATCCTT GGTTTACCAT TTTGTGAGGC AATTGACATG 1260 

TGGTCCCTGG GCTGTGTTAT TGCAGAATTG TTCCTGGGTT GGCCGTTATA TCCAGGAGCT 1320 

TCGGAGTATG ATCAGATTCG GTATATTTCA CAAACACAGG GTTTGCCTGC TGAATATTTA 1380 

TTAAGCGCCG GGACAAAGAC AACTAGGTTT TTCAAOCGTG ACACGGACTC ACCATATCCT 1440 

TTGTGGAGAC TGAAGACACC AGATGACCAT GAAGCAGAGA CAGGGATTAA GTCAAAAGAA 1500 

GCAAGAAAGT ACATTTTCAA CTGTTTAGAT GATATGGCCC AGGTGAACAT GACGACAGAT 1560 

TTGGAAGGGA GCGACATGTT GGTAGAAAAG GCTGACCGGC GGGAGTTCAT TGACCTGTTG 1620 

AAGAAGATGC TGACCATTGA TGCTGACAAG AGAATCACTC CAATCGAAAC CCTGAACCAT 1680 

CCCTTTGTC A CCATGACACA CTTACTCGAT TTTCOCCACA GCACACACGT CAAATCATGT 1740 

TTCCAGAACA TGGAGATCTG CAAGOGTCGG GTGAATATGT ATGACACGGT GAACCAGAGC 1800 

AAAACCCCTT TCATCACGCA CGTGGCCCCC AGCACGTCCA CCAACCTGAC CATGACCTTT 1860 

AACAACCAGC TGACCACTGT CCACAACCAG GCTCCCTCCT CTACCAGTGC CACTATTTCC 1920 

TTAGCCAATC CCGAAGTCTC CATACTAAAC TACCCATCTA CACTCTACCA GCCCTCAGCG 1980 

GCATCCATCG CTGCAGTGGC CCAGCGGAGC ATGCCCCTGC AGACAGGAAC AGCCCAGATT 2040 

TGTGCCCGGC CTGACCOGTT CCAGCAAGCT CTCATCGTGT GTCCCCCCGG CTTCCAAGGC 2100 

TTGCAGGCCT CTCCCTCTAA GCACGCTGGC TACTCGGTGC GAATGSAAAA TGCAGTTCCC 2160 

ATCGTCACTC AAGCCCCAGG AGCTCAGCCT CTTCAGATCC AACCAGGTCT GCTTGCCCAG 2220 

CAGGCTTGGC CAAGTGGGAC CCAGCAGATC CTGCTTCCCC CAGCATGGCA GCAACTGACT 2280 

GGAGTGGCCA CCCACACATC AGTGCAGCAT GOCACCGTGA TTCCCGAGAC CATGGCAGGC 2340 

ACCCAGCAGC TGGCGGACTC GAGAAATACG CATGCTCACG GAAGCCATTA TAATCCCATC 2400 

ATGCAGCAGC CTGCACTATC GACCGGTCAT GTGACCCT1C CAGCAGCACA GCCCTTAAAT 2460 

GTGGGTGTGG CCCACGK3AT GCGGCAGCAG CCAACCAGCA CCACCTCCTC CCGGAAGAGT 2520 

AAGCAGCACC AGTCATCTGT GAGAAATGTC TCCACCTGTG AGGTGTCCTC CTCTCAGGCC 2580 

ATCAGCTCCC CACAGCGATC CAAGCGTGTC AAGGAGAACA CACCTCCCCG CTGTGCCATG 2640 

GTGCACAGTA GCCCGGCCTG CAGCACCTCG GTCACCTGTG GGTGGGGCGA CGTGGCCTCC 2700 

AGCACCACCC GGGAACGGCA GCGGCAGACA ATTGTCATTC CCGACACTCC CAGOCCCACG 2760 

GTCAGCGTCA TCACCATCAG CAGTGACACG GACGAGGAGG AGGAACAGAA ACACGCCCCC 2820 

ACCAGCACTG TCTCCAAGCA AAGAAAAAAC GTCATCAGCT GTGTCACAGT CCACGACTCC 2880 

CCCTACTCCG ACTCCTCCAG CAACACCAGC CCCTACTOCG TGCAGCAGCG TGCTGGGCAC 2940 






AACAATGCCA ATGOCTTTGA CACCAAGGGG AGCCTGGAGA ATCACTGCAC GGGGAAOOOC 3000 
CGAACCATCA TCGTGCCACC CCTGAAAACC CAGGCCAGO0 AAGTATTGGT GGAGTGTGAT 3060 
AGCCTGGTGC CAGTCAACAC CAGTCACCAC TCGTCCTCCT ACAAGTCCAA GTCCTCCAGC 3120 
AACGTGACCT CCACCAGCGG TCACTCTTCA GGGAGCTCAT CTGGAGCCAT CACCTACCGG 3180 
CAGCAGCGGC GGGGOOCCCA CTTCCAOCAO CACCAGCCAC TCAATCTCAG CCAGGCTCAG 3240 
CAGCACATCA CCACGGACCG CACTGGQAGC CACCOAAGGC AGCAGGCCTA CATCACTOCC 3300 
ACCATGGCCC AGGCTCCGTA CTCCTTCCCG CACAACAGCC CCAGCCACGO CACTOTQCAC 3360 
CCGCA1CTGG CTGCACCCGC TGCCOCTOCC CACCTCCCCA CCCAGCCCCA CCTCTACACC 3420 
TACACTGCGC            GGGCTCCACC 6GCACCGTG6 CCCAGCTGGT GGCCTCGCAA 3480 
GGCTCTGCGC GCCACACCGT GCAGCACACT GCCTACCCAS CCAGCATCGT CCACCAGGTC 3540 
CCCGTGAGCA TGGGCCCCCG GGTCCTGCCC TCGCCCACCA TCCACCCGAG TCAGTATCCA 3600 
GCCCAATTTG CCCACCAGAC CTACATCAGC GCCTCGCCAO CCTCCACCGT CTACACTGGA 3660 
TACCCACTGA GCCCCGCCAA GGTCAACCAQ TACCCTTACA TATAAACACT GGAGGGGAGG 3720 
GAGGGAGGGA GGGAGGGASA GAATGGCCCG AGGGAGGAGG GAGAGAAGGA GGGAGGCGCT 3780 
CCXGGGACCG TGGGCGCTGG CCTTTTATAC TQAAQATGCC GCACACAAAC AATGCAAAC6 3840 
GGGCAGGGGC GGGGGGGGGG GGGGCAGAOG GCAGGGGGAC GGGTCGGGAC ACCAGTQAAA 3900 
CTTGAACCGG GAAGTGGGAG GACGTAGAGC AGA6AAGAGA ACATTTTTAA AAGGAAGGGA 3960 
TTAAAGAGGG TGGGAAATCT ATGGTTTTTA TTrTAAAAAA 



SEQ ID NO:282 P02 Protein sequence: 
Protein Accession #: NP_073577 



MAPVYEGMAS HVQVFSPHTL QSSAPCSVKK LKVEPSSNWD MTGYGSHSKV YSQSKNIPPS 60 
QPA S 'IT VSI S UVPNPSLPY EQTtVFPGST GHIWTSASS TSVTGQVLGG PHNLMRRSTV 120 
SLLDTYQKCG LKRKSEHEN TSS VQIIEEH PFMIQNNASO ATVATATTST ATSKNSGSNS 180 
EGDYQLVQHE VLCSMTNTYE VLEFLGROTF GQWKCWKRG 1NEIVAKIL KNRFSYARQG 240 
QIEVSILARL STES ADDYNF VRAYECFQHK NHTCLVFEML BQNLYDFLKQ NKFSPLPLKY 300 
IRPVLQQVAT ALMKLKSLOL IHADLKPENI MLVDPSRQPY RVKVIDFGSA SHVSKAVCST 360 
YLQSRYYRAP EIUjGLPFCE AIDMWSLGCV IAELFLGWPL YPGASEYDQI RYISQTQGLP 420 
AEYLLSAGTK TTRFFNRDTD SPYPLWRLKT PDDHEAETGI KSKEARKYTF NCLDDMAQVN 480 
MTTDLEGSDM LVEKADRREF IDLLKKMLT1 DADKRTTPIE TLNHPFVTMT HLLDFPHSTH 540 
VKSCFQNMH CKRRVNMYDT VNQSKTPFIT HVAPSTSTNL TMTFNNQLTT VHNQAPSSTS 600 
ATIStANPEV SILNYPSTLY QPS AASMAAV AQRSMPLQTO TAQICARPDP FQQALIVCPP 660 
GFQGLQASPS KHAGYSVRME NAVPIVTQAP GAQPLQIQPG LLAQQAWPSG TQQILLPPAW 720 
QQLTGVATHT SVQHATYIPE TMAGTQQLAD WRNTHAHGSH YNPIMQQPAL LTOHVTLPAA 780 
QPLNVGVAHV MRQQFTSTTS SRKSKQHQSS VRNVSTCBVS SSQA1SSPQR SKRVKENTPP 840 
RCAMVHSSPA CSTSVTCGWG DVASSTTRER QRQTWIPDT PSPTVSVm SSDTDEEEEQ 900 
KHAPTSTVSK QRKNVISCVT VHDSPYSDSS SNTSPYSVQQ RAGHNNANAF DTKGSLENHC 960 
TGNPRTUVP PLKTQASEVL VECDSLVPVN TSHHSSSYKS KSSSNVTSTS GHSSGSSSGA 1020 
ITYRQQRPGP HFQQQQPLNL SQAQQHTTTD RTGSHRRQQA YTIPTMAQAP YSFPHNSPSH 1080 
GTVHPHLAAA AAAAHLPTQP HLYTYTAPAA LGSTGTVAHL VASQGS ARHT VQHTAYPASI 1140 
VHQVPVSMGP RVLPSPTIHP SQYPAQFAHQ TYIS ASPAST VYTGYPLSPA KVNQYPYI 

SEQ ID NO:283 PBY1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_017700 

Coding sequence: 147-806 (underlined sequence corresponds to start and stop codon) 



1 11 21 31 41 51 
I I I I I I 

AGTCACAGCC AGGTAACCCT GGAGTGAASC GGTTTAGTTA GAAGSGAGCA GATAAACTCG 60 
TCACTCTAGT A6CTTTAACC CTCACCCTGA GGCACCTTAG CAATCAGCCA TTGCCTGCAA 120 
GCCTCCAAAO CTTGTCTTTG CCTAATATGS AGCCCAAAGA AGCCACTGGG AAAGAAAACA 180 
TGGTCACCAA GAAAAAGAAT CTGGCCTTCT TGAGGTCTAG ACTCTATATG CTGGAGAGAA 240 
GGAAGACTGA CACTGTGGTT GAGAGCAGTG TTTCTGGGGA CCACTCTGGC ACCTTGAGGA 300 
GGAGCCAATC TGACAGCACC GAATACAACC AGAAATTACA AGAAAAGATG ACTCCACAGG 360 
GTGA6TGTTC TGTAGCTGAG ACCTTAACCC CAGAGGAAGA GCATCATATG AAGAGGATGA 420 
TGGCAAAGCG GGAAAAGATC ATTAAGGAGC TGAXACAGAC AGAAAAGGAT TATCTCAATC 480 
ATCTAGAGCT GTGTGTTAGG GAASTGGTTC AGCCCCTGAG AAATAAAAAG ACTGATAGGC 540 
TGGATGTGGA TAGCTTGTTT AGCAACATTG AGTCCGTGCA TCAGATATCA GCCAAGCTGC 600 
TGTCATTGTT GGAAGAGGCC ACAACAGACG TGGAACOGGC CATGCAAGTA ATTGGAGAAS 660 
TATTCTTGCA GATTAAAGGG CCACTGGAAG ATATTTATAA AATCTACTGC TATCACCATG 720 
ATGAAGCACA TASTATACTG GAGTCCTATO AAAAGCAAGA AGAGCTGAAG GAACATTTGA 780 
GCCACTGTAT CCAGTCCTTA AAGTAAGGCC TTTTCAAATG ATGATTCCCA TCTCCTCTCA 840 
GTTGCCTAGC AGGGAACATT TTAAATGGAT GTAGATGAAA GGTCTCACAT AAATCCTATG 900 
TTTTATGAGA CTTGCTGGGA GCTCTGCTTT GCATTCCCTT TATAAAAAGC TGACATGCCA 960 
GAAGCCCTGA TTGACTTTTT TTCCCCCTGC GAGAATGACT AAAAATAACA TGGAAGAAGA 1020 
TTTAGAGCTC TGCAGCGATT GAAAAATGCA ATATCAAAAT ATAAAATGTG GAAGAAAAGC 1080 
CTCTTCTTAA ASCTATTGTA ACTCGCCTGG CCCCACGTAG TTCAAGGATT ATGTGAGATA 1140 
ACACGTGGCC CCATGACCAC TGGAGCACAT GGGTTAATGG AGTTAGGGGA ATGGCCTACA 1200 
ACTCTGCATO GCCGTCTTCT TTCCCCAAAC TCACTGTGGG GAGATGGGTG AAGACAAGTC 1260 
AGGCCTTGTT AAAGTTAGTT TCAGAACAAT TACTCATGCC TTCCTTTCTC ATCCCTAAAA 1320 
CATTGGTGGG GGAGCTACAC AATGTACTTT TTCTTTTCTA GAGGAAGTAT CTATTCACTG 1380 
TGAAAATCTG AAAAATATAA CAAAGTATGT GTAAGATAAA AACCCCTTGC TATTTCAAAA 1440 
AAAAAAAAAA AAAAAAAAAA AAAA 



SEQ ID NO:284 PBY1 Protein sequence: 
Protein Accession #: NP_060170 



1 11 21 31 41 51 



MEFKEATGKE HHVTKKKHIA PLRSRLYMLB RRKTOTWES SVSGDHSGTL RHSQSDRTBY 60 

UQKLQEKMTP QGECSVAETL TPEBEBHHKR MMAKREKIIK ELIQTEKDYL KDLELCVBEV 120 

VQPLRNKKTD HLDVDSLFSN IESVEQISAK LLSLLBBATT DVEPAMQVIG EVPLQIKGPL 180 
EDIYMYCYH HDEAHSOES YEKEEELKEH LSHCIQSLK 

SEQ ID NO:285 PBQ9 DNA SEQUENCE 

Nucleic Acid Accession #: X66534 

Coding sequence: 523-2678 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

CCCTTATGGC GATTGGGCGG CTOCAGAGAC CAGGACTCAG TTCCCCTGCC CTA8TCTGAG 60 

CCTAGTGGGT GGGACTCAGC TCAGAGTCAG TTTTCAGAAG CAGGTTTCAG TTGCAGAOTT 120 

TTCCTACACT TTTCCTGCGC TAGAGCAGCG AGCAGCCTGG AACAGACCCA GGCGGAGGAC 180 

ACCTGTGGGG GAGGGAGCGC CTGGAGGAGC TTAGAGACCC CRCCCGGGCG TGATCTCACC 240 

ATQTGCGGAT TTGCGAGGCG CGCCCTGGAG CTGCTAGAGA TCCGGAAGCA CAGCCCOGAfi 300 

GTGTGCGAAG CCACCAASAC TGCGGCTCTT GGAGAAAGCG TGAGCAGGGO GCCACCGCGG 360 

TCTCCGGCCT GTCTCCftOCC TGTCGCCT6A 'GCTGOCTGAC AGTGACAATS ACATCCCAGT 420 

TACCAGTGTC CTTGAATTGA TAGTGGCTTC TGTTTGTCAG TCTCATATAA GAACTACAGC 480 

TCATCAGGAG GAGATCGCAG CAGGGTAAGA GACACCAACA CCATGTTCTG CACGAAGCTC 540 

AAGGATCTCA AGATCACAGG AGAGTGTCCT TTCTCCTTAC T6GCAOCAGG TCAAGTTCCT 600 

AACGAGTCTT CAGAGGAG6C AGCAGGAAGC TCAQAGAGCT 6CAAA0CAAC CGTGCCCATC 660 

TGTCAAGACA TTCCTGAGSA GAACATACAA GAAAGTCTTC CTCAAA GAAA AACCAG TCGG 720 

AGCCGAGTCT ATCTTCACAC TTTGGCAGAG AGTATTTGCA AACTGATTTT CCCAGAGTTT 180 

GAACGGCT6A ATGTTGCACT TCAGAGAACA TTGGCAAAGC ACAAAATAAA AGAAAGCAGG 840 

AAATCTTT6G AAAGAGAACA CTTTGAAAAA ACAATTGCAG AGCAAGCAGT GCAGCAGAGT 900 

CCAGTGGAGT TATCAAAGAA TCTCTTGGTG AAGAGGTTTT TAAAATATGT TAC6AGGAA6 960 

ATGAAAACAT CCTTGGGGTG GTTGGAGGCA CCCTTAAAGA TTTTTAAACA GCTTCAGTAC 1020 

CCTTCTGAAA CA6AGCAGCC ATTGCCAAGA AGCAGGAAAA AGGGGCAGCT TGAGGACGCC 1080 

TCCATTCTAT GCCTGGATAA GGAGGAXGAT TTTCTACATG TTTACTACTT CTTCCCTAAG 1140 

AQAACCACCT CCCTGATTCT TCCCGGCATC ATAAAGGCAG CTGCTCACGT ATTATATGAA 1200 

ACGGAAGTGG AAGTGTCGTT AATGCCTCCC TGCTTCCATA ATGATTGCAO CGAGTTTGTG 1260 

AATCAGCCCT ACTTGTTGTA CICCGTTCAC ATGAAAAGCA CCAAGCCATC CCTGTCCCCC 1320 

AGCAAACCCC AGTCCTCGCT GGTGATTCCC ACATCGCTAT TCTGCAAGAC ATTTCCATTC 1380 

CATTTCATCT TTGACAAAGA TATGACAATT CTGCAATTTG GCAATGGCAT CAGAAG6CT6 1440 

ATGAACAGGA GAGACTTTCA AGGAAAGCCT AATTTTGAAT ACTTTGAAAT TCTGACTCCA 1500 

AAAATCAACC AGACCTTTAG CGGGATCATG ACTATGTTGA ATATGCAQTT TSTTGTACGA 1560 

GTGAGGAGAT GGGACAACTC TGTGAAGAAA TCTTCAAGGG TTATGGACCT CAAAGGCCAA 1620 

ATGATCTACA TTGTTGAATC CAGTGCAATC TTGTTTTTGG GGTCACCCTG TGTGGACAGA 1680 

TTAGAAGATT TTACAGGACG AGGGCTCTAC CTCTCAGACA TCCCAATTCA CAATGCACTG 1740 

AGGGATGTGG TCTTAATAGG GGAACAAGCC CGAGCTCAAG ATGGCCTGAA GAAGAGGCTG 1800 

GGGAAGCTGA AGGCTACCCT TGAGCAAGCC CACCAAGCCC TGGAGGAGGA GAAGAAAAAG 1860 

ACAGTAGACC TTCTGTGCTC CATATTTCCC TGTGAGGTTG CTCAGCAGCT GTGGCAAGGG 1920 

CAAGTTGTGC AAGCCAAGAA GTTCAGTAAT GTCACCATGC TCTTCTCAGA CATCGTTGGG 1980 

TTCACTGCCA TCTGCTCCCA GTGCTCACCG CTGCAGGTCA TCACCATGCT CAATGCACTG 2040 

TACACTCGCT TCGACCAGCA GTGTGGAGAG CTGGATGTCT ACAAGGTGGA GACCATTGCG 2100 

ATGCCTATTG TGTGGCTTGG GGGATTACAC AAAGAGAGTG ATACTCATGC TGTTCAGATA 2160 

GCGCTGATGG CCCTGAAGAT GATGGAGCTC TCTGATGAAG TTATGTCTCC CCATGGAGAA 2220 

CCTATCAAGA TGCGAATTGG ACTGCACTCT GGATCAGTTT TTGCTGGCGT CGTTGGAGTT 2280 

AAAATGCCCC GTTACTGTCT TTTTGGAAAC AATGTCACTC TGGCTAACAA ATTTGAGTCC 2340 

TGCAGTGTAC CACGAAAAAT CAATGTCAGC CCAACAACTT ACAGATTACT CAAAGACTGT 2400 

CCTGGTTTCG TGTTTACCCC TCGATCAAGG GAGGAACTTC CACCAAACTT CCCTAGTGAA 2460 

ATOCCCGGAA TCTGCCATTT TCTGGATGCT TACCAACAAG GAACAAACTC AAAACCATGC 2520 

TTCCAAAAGA AAGATGTGGA AGATGCAAGC CAATTTTTTA GGCAAAGCAT CAGGAATAGA 2580 

TTAGCAACCJT ATATACCTAT TTATAAGTCT TTGGGGTTTG ACTCATTGAA GATGTGTAGA 2640 

GCCTCTGAAA GCACTTTAGG GATTGTAGAT GG CTAAC AAG CAGTATTAAA ATTTCAGGAG 2700 

CCAAGTCACA ATCTTTCTCC TGTTTAACAT GACAAAATGT ACTCACTTCA GTACTTCAGC 2760 

TCTTCAAGAA AAAAAAAAAA ACCTTAAAAA GCTACTTTTG TGGGAGTATT TCTATTATAT 2820 

AACCAGCACT TACTACCTGT ACTCAAAATT CAGCACCTTG TACATATATC AGATAATTGT 2880 

AGTCAATTGT ACAAACTGAT GGAGTCACCT GCAATCTCAT ATCCTGGTGG AATGCCATGG 2940 

TTATTAAAGT GTGTTTGTGA TAGTTGTCGT CAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 3000 
AAAA 

SEQ ID NO:286 PBQ9 Protein sequence: 
Protein Accession #: Q02108 

1 11 21 31 41 51 

I I I I I I 

KFCTKLKDLK ITGECPFSLL APGQVPNESS EEAAGSSESC KATVPXCQDI PEKNIQESLP 60 

QRKTSHSBVY LHTLAESICK LIFPEFERUJ VALQRTLAKH KIKESRKSLE REDFEKTIAE 120 

QAVAAGVPVE V1KESLGEEV PK1CYEEDEN ILGWGGTLK DFLNSPSTLt, KQSSHCQEAG 180 

KKGRLEDASI LCLDKEDDPL HVYYFPPKRT TSLILPGI1K AAAHVLYETE VBVSLMPPCF 240 

HNDCSEFVNQ PYLLYSVHMK STKPSLSPSK PQSSLVIPTS LFCKTFPFHF KFDKDMTILQ 300 

PGNGXRRLHN RRDFQGKPNP EEYPBILTPK DIQTPSGIKT KUOiQPWRV RSHDH5VKKS 360 

SRVMDLKGQM IYIVESSAIL PI/GSPCVDRL EDPTGRGLYt SDIPIHNALR DWLIGBQAR 420 

AQDGLKKRLG KLKATLEQAH QAL EE E KK KT VDLLCSIPPC EVAQQLWQGQ WQAKKPSNV 480 

THLFSDIVGP TAICSQCSPL QVITMLNALY TRFDQQCGBL DVYKVETIGD AYCVAGGLHK 540 

BSDTHAVQIA LMALKMHELS DEVMSPHGBP IKKRIGLHSG SVFAGWGVK HPRYCLFGNK 600 

VTLANKFESC SVPRKHJVSP TTYRLLKDCP GFVFTPRSRE KLPPNFPSBI PGICHFtiDAY. 660 
QQGTNSKPCF QKKDVEDGNA NFLGKASGID 

SEQ ID NO:287 PFD2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_000720 

Coding sequence: 11M654 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

I I ! I I I 

AOAATAAGGO CACGGACCGC GGCTCCTATC TCTTGGTGAT CCCCTTCCCC ATTCCGCCCC 60 

CGCCTCAACG CCCAGCACAG TGCCCTGCAC ACAGTAGTCQ CTCAATAAAT GTTCOTGGAT 120 

GATGATGATG ATGATGATGA AAAAAATGCA GCATCAACGQ CAGCAGCAAG CG6ACCACGC 180 

GAACGAGSCA AACTATQCAA GAGGCACCAG ACTTCCTCTT TCTGGTGAAG GACCAACTTC 240 

TCAGCCGAAT AGCTCCAABC AAACTGTCCT GTCTTGGCAA GCTGCAATCG ATGCTGCTAG 300 

ACAGGCCAAG GCTGCCCAAA CTATGAGCAC CTCTGCACCC CCACCTGTAO GATCTCTCTC 360 

CCAAAGAAAA CGTCAGCAAT ACGCCAAQAG CAAAAAACAG GGTAACTCGT CCAACA GCCO 420 

ACCTGCCCGC GCCCT T T T C T GTTTATCACT CAATAACCCC ATCC GAAGAG CCTGCATTAG 480 

TATAGTGGAA TGQAAACCAT TTGACATATT TATATTATTG GCTATTTTTG CCAATTGTGT 540 

GGCCTTAGCT ATTTACATCC CA3TCCCTGA AGATGATTCT AATTC AACAA ATCATAACTT 600 

GGAAAAAGTA GAATATGCCT TCCTGATTAT TTTTACAGTC GAGACATTTT TGAAG ATTAT 660 

AGCGTATGGA TTATTGCTAC ATCCTAATGC TTATGTTAGG AATGGATGGA ATTTACTGGA 720 

TTTTGTTATA GTAATAGTAG GATT GT T TA G TGTAATTTTG GAACAATTAA, CCAAAGAAAC 780 

AGAAGGCGGG AACCACTCAA GCGGCAAATC TGGAGGCTTT GATGTCAAAG CCCTCCGTGC 840 

CTTTCGAGTG TTGCGACCAC TTCGACTAGT GTCAGGGGTG CCCAGTTTAC AAOTT GTCCT 900 

GAACTCCATT ATAAAAGCCA TGGTTCCCCT CCTTCACATA GCCCTTTTGQ TATTATTTGT 960 

AATCATAATC TATGCTATTA TAGGATTGGA ACTTTTTATT GGAAAAATGC ACAA AACA TO 1020 

TTTTTTTGCT GACTCAGATA TCGTAGCTGA AGAGGACCCA GCTCCATGTG CGTTCTCAGG 1080 

GAATGGACGC CAGTGTACTG CCAATGQCAC GGAATGTAGG AGTGGCTGGG TTGG CCCG AA 1140 

CGGAGGCATC ACCAACTTTG ATAACTTTGC CTTTGCCATG CTTACTGTQT TTCAGTGCAT 1200 

CACCATGGAG GGCTGGACAG ACGTGCTCTA CTGGGTAAAT GATGCGATAG GATGGGAATG 1260 

GCCATGGGTG TATTTTCTTA GTCTGATCAT CCTTGGCTCA TTTTTCGTCC TTAACCTGGT 1320 

TCTTGGTGTC CTTAGTGGAG AATTCTCAAA GGAAAGAGAG AAGGCAAAAG CACGGGOAGA 1380 

TTTCCAGAAG CTCCGGGAGA AGCAGCAGCT GGAGGAGGAT CTAAAGGGCT ACTTGGATTG 1440 

GATCACCCAA GCTGAGGACA TCGATCCGGA GAATGAGGAA GAAGGAGGAG ADGAAGGCAA 1500 

ACGAAATACT AGCATGCCCA CCAGCGAGAC TGAGTCTGTG AACACAGAGA ACGTCAGCGG 1560 

TGAAGGCGAG AACCGAGGCT GCTGTGGAAG TCTCTGGTGC TGGTGGAGAC GGAGAGGCGC 1620 

GGCCAAGGCG GGGCCCTCTG GGTGTCGGCG GTGGGGTCAA GCCATCTCAA AATCCAAACT 1680 

CAGCCGACGC TGGCGTCGCT GGAACCGATT CAATCGCAGA AGATGTAGGG CCGCCGTGAA 1740 

GTCTGTCACG TTTTACTGGC TGQTTATCGT CCTGGTGTTT CTGAACACCT TAACCATTTC 1800 

CTCTGAGCAC TACAATCAGC CAGATTGGTT GACACAGATT CAAGATATTG CCAACAAAGT 1860 

CCTCTTGGCT CTGTTCACCT GCGAGATGCT GGTAAAAATG TACAGCTTGG GCCTCCAAGC 1920 

ATATTTCGTC TCTCTTTTCA ACCGGTTTGA TTGCTTCGTG GTGTGTGGTG GAATCACTGA 1980 

GACGATCCTG GTGGAACTGG AAATCATGTC TCCCCTGGGG ATCTCTGTGT TTCGGTGTGT 2040 

GCGCCTCTTA AGAATCTTCA AAGTGACCAG GCACTGQACT TCCCTOAGCA ACTTAGTGGC 2100 

ATCCTTATTA AACTCCATGA ASTCCATCGC TTC GC TGTTG CTTCTGCTTT TTCTCTTCAT 2160 

TATCATCTTT TCCTTGCTTG GGATGCAGCT GTTTGGCGGC AAGTTTAATT TTGATGAAAC 2220 

GCAAACCAAG CGGAGCACCT TTGACAATTT CCCTCAAGCA CTTCTCACAG TOTTCCAGAT 2280 

CCTGACAGGC GAAGACTGGA ATGCTGTGAT GTACGATGGC ATCATGGCTT ACGGGGGCCC 2340 

ATCCTCTTCA GGAATGATCG TCTGCATCTA CTTCATCATC CTCTTCATTT GTGGTAACTA 2400 

TATTCTACTG AATGTCTTCT TGGCCATCGC TGTAGACAAT TTGGCTGATG CTOAAAGTCT 2460 

GAACACTGCT CAGAAAGAAG AAGCGGAAGA AAAGGAGAGG AAAAAGATTG CCAGAAAAGA 2520 

GAGCCTAGAA AATAAAAAGA ACAACAAACC AGAAGTCAAC CAGATAGCCA ACAGTGACAA 2580 

CAAGGTTACA ATTGATGACT ATAGAGAAGA GGATGAAGAC AAGGACCCCT ATCCGCCTTG 2640 

CGATGTGCCA GTAGGGGAAG ASGAAGAGGA AGAGGAGGAG GATGAACCTG AGGTTCCTGC 2700 

CGGACCCCGT CCTCGAAGGA TCTCGGAGTT GAACATGAAG GAAAAAATTG CCCCCATCCC 2760 

TGAAGGGAGC GCTTTCTTCA TTCTTAGCAA GACCAACCCG ATCCGCGTAG GCTGCCACAA 2820 

GCTCATCAAC CACCACATCT TCACCAACCT CATCCTTGTC TTCATCATGC TGAGCAGCGC 2880 

TGCCCTGGCC GCAGAGGACC CCATCCGCAG CCACTCCTTC CGGAACACGA TACTOG OTTA 2940 

CTTTGACTAT GCCTTCACAG CCATCTTTAC TOTTGAGATC CTGTTGAAGA TGACAACTTT 3000 

TGGAGCTTTC CTCCACAAAG GGGCCTTCTG CAGGAACTAC TTCAATTTGC TGGATATGCT 3060 

GGTGGTTGGG GTGTCTCTGG TGTCATTTGG GATTCAATCC AGTGCCATCT CCGTTGTGAA 3120 

GATTCTGAGG GTCTTAAGGG TCCTGCGTCC CCTCAGGGCC ATCAACAGAG CAAAAGGACT 3180 

TAAGCACGTG GTCCAGTGCG TCTTCGTGGC CATCCGGACC ATCGGCAACA TCATGATCGT 3240 

CACTACCCTC CTGCAGTTCA TGTTTGCCTG TATCGGGGTC CAGTTGTTCA AGGGGAAGTT 3300 

CTATCGCTGT ACGGATGAAG CCAAAAGTAA CCCTGAAGAA TGCAGGGGAC TTTTCATCCT 3360 

CTACAAGGAT GGGGATGTTG ACAGTCCTGT GGTCCGTGAA CGGATCTGGC AAAACAGTGA 3420 

TTTCAACTTC GACAACGTCC TCTCTGCTAT GATGGCGCTC TTCACAGTCT CCACGTTTGA 3480 

GGGCTGGCCT GCGTTGCTGT ATAAAGCCAT CQACTCGAAT GGAGAGAACA TCGGCCCAAT 3540 

CTACAACCAC CGCGTGGAGA TCTCCATCTT CTTCATCATC TACATCATCA TTGTAGCTTT 3600 

CTTCATGATG AACATCTTTG TGGGCTTTGT CATCGTTACA TTTCAGGAAC AAGGAGAAAA 3660 

AGAGTATAAG AACTGTGAGC TGGACAAAAA TCAGCGTCAG TGTGTTGAAT ACGCCTTGAA 3720 

AGCACGTCCC TTGCGGAGAT ACATCCCCAA AAACCCCTAC CAGTACAAGT TCTGGTACGT 3780 

GGTGAACTCT TCGCCTTTCG AATACATGAT GTTTGTCCTC ATCATGCTCA ACACACTCTG 3840 

CTTGGCCATG CAGCACTACG AGCAGTCCAA GATGTTCAAT GATGCCATGG ACATTCTGAA 3900 

CATGGTCTTC ACCGGGGTGT TCACCGTCGA GATGGTTTTG AAAGTCATCG CATTTAAGCC 3960 

TAAGGGGTAT TTTAGTGACG CCTGGAACAC GTTTGACTCC CTCATCGTAA TCGGCAGCAT 4020 

TATAGACGTG GCCCTCAGCG AAGCGGACOC AACTGAAAGT GAAAATGTCC CTGTCCCAAC 4080 
TGCTACACCT GGGAACTCTG AAGAGAGCAA TAGAATCTCC ATCACCTTTT TCCGTCTTTT 4140 

CCGAGTGATG CGATTGGTGA AGCTTCTCAG CAGGGQGGAA GGCATCCGGA CAT TGCTGTG 4200 

OACTTTTATT AAGTCCTTTC AGGCGCTCCC GTATGTGGCC CTCCTCATAG CCATGCTGTT 4360 

CTTCATCTAT GCGGTCATTG GCATGCAGAT GTTTGGGAAA GTTGCCATGA CAGATAACAA 4320 

CCAGATCAAT AGGAACAATA ACTTCCAGAC ffTTTCCCCAG GC6GTGCTGC TGCTCTTCAG 4380 

GTQTGCAACA GOTGAGSCCT GGCAGGAGAT CATGCTGGCC TGTCTCCCAG GGAAGCTCTG 4440 

TGACCCTGAG TCAGATTACA ACCCCGGGGA CGAGTATACA TGTGGGAGCA A CTTT GCCAT 4500 

TQTCTATTTC ATCABTTTTT ACATGC1CTO TGCATTTCTG ATCATCAATC TGTTTGTGGC 4560 

TGTCAXCATG GATAATTTCG ACTATCTGAC CCGGGACTGG TCTATTTTGG GGCCTCACCA 4620 

TTTAGATGAA TTCAAAAGAA TATGGTCAGA ATATGACCCT GAGGCAAAG6 GAAG6ATAAA 46S0 

ACAOCTTCAT OTGGTCACTC TGCTTCGACG CATCCAGOCT CCCCTGGGGT TTGGGAAJ3TT 4740 

ATGTCCACAC AGGGTAGCGT GCAAGAGATT AGTTGCCATG AACATGCCTC TCAACAGTGA 4800 

CGGGACAGTC ATGTTTAATG CAACCCTGTT TGCTTTGGTT GGAAGGGCTC TTAAGATCAA 4860 

GACCGAAGGG AACCTGGAGC AAGCTAATGA AGAACTTCGQ GCTGTGATAA AGAAAATTTG 4920 

GAAGAAAACC AGCATGAAAT TACTTGACCA AGTTGTCCCT CCAGCTGGTO ATQATGAGGT 4980 

AACOGTGGGG AAGTTCTATG CCACTTTCCT GATACAGGAC TACTTTAGGA AATTCAAGAA 5040 

ACGGAAAGAA CAAGGACTGG TGGGAAAGTA CCCTGCGAAG AACACCACAA TTCCCCTACA 5100 

GGCGGGATTA A6GACACTGC ATGACAITGG GCCAGAAATC CGGCGTGCTA TATCGTGTGA S160 

TTTCCAAGAT GACOAGCCTG AGGAAACAAA ACGAGAAGAA GAAGATQATO TGTTCAAAAQ 5220 

AAATGGTGCC CTGCT TCGAA ACCATGTCAA TCA3CTTAAT AGTGATAGGA GAGATTCCCT 5280 

TCAGCAGACC AATACCACCC ACCGTCCCCT GCATGTCCAA AGGOCTTCAA TTCCACCTGC 5340 

AAGTGATACT GAGAAACGGC TGTTTCCTCC AGCAGGAAAT TCGGTOTGTC ATAACCATCA 5400 

TAACCATAAT TCCATAGGAA AGCAAGTTCC CACCTCAACA AATGCCAATC TCAATAATGC 5460 

CAATATGTCC AAAGCTGCCC ATGGAAAGOG GCCCAGCATT G6GAACCTT0 AGCATGTGTC 5520 

TGAAAATGGG CATCATTCTT CCCACAAGCA TGACCGGGAG CCTCAGAGAA GGTCCAGTGT 5580 

GAAAAGAACC CGCTATTATG AAACTTACAT TAGGTOOGAC TCAGGAGATG AACAGCTCCC 5640 

AACTATTtGC CGGGAAGACC CAGAGAXACA TGGCTATTTC AGGGACCCCC ACTGCTTGGO 5700 

GGAQCAGGAG TATTTCAGTA GTGAGGAATG CTACGAGGAT GACAGCICGC. GCACCTGGAG 5760 

CAGGCAAAAC TATGGCTACT ACAGCAGATA CCCAGGCAGA AACATCGACT CTCAGAGGCC 5820 

CCGAGGCTAC CATCATCCCC AAGGATTCTT •GGAGGACGAT GACTCGCOCG TTTGCTATGA 5880 

TTCACGGAGA TCTCCAAGGA GACGCCTACT ACCTCCCACC CCAGCATCCC ACCGGAGATC 5940 

CTCCTTCAAC TTTGAGTGCC TGCGCCGGCA GAGCAGCCAG GAAGAGGTCC CGTCGTCTCC 6000 

CATCWCCCC CATCGCAOGG CCCTGCCTCT GCATCTAATG CAGCAACAGA TCATGGCAGT 6060 

TOOCGGCCTA GATTCAAGTA AAGCCCAGAA GTACTCACCG AGTCACTCGA CCCGGTCGTG 6120 

GGCCACCCCT CCAGCAACCC CTCCCTACCG GGACTGGACA OCGTGCTACA CCCCCCTGAT 6180 

CCAAOTGGAG CAGTCAGAGG CCCTGGACCA GGTGAACGGC AGCCTGCCGT CCCTGCACCG 6240 

CAGCTCCTGG TACACAGACG AGCCCGACAT CTCCTACCGG ACTTTCACAC CAGCCAGCCT 6300 

GACTGTCCCC AGCAGCTTCC GGAACAAAAA CAGCGACAAG CAGAGGAGTG OGGACAGCTT 6360 

GGTGGAGGCA GTCCTGATAT CCGAAGGCTT GGGACGCTAT GCAAGGGACC CAAAATTTGT 6420 

GTCAGCAACA AAACACGAAA TCGCTGATGC CTGTGACCTC ACCATCGACG AGATGGAGAG 6480 

TGCAGCCAGC ACCCTGCTTA ATGGGAACGT GCGTOCCCGA GCCAACGGGG ATGTGGGCCC 6540 

CCTCTCACAC CGGCAGGACT ATGAGCTACA GGACTTTGGT OCTGGCTACA GCGACGAAGA 6600 

GCCAGACCCT GGGAGGGATG AGGAGGACCT GGCGGATGAA ATGATATGCA TCACCACCTT 6660 

GTAGCCCCCA GCGAGGGGCA GACTGGCTCT GGCCTCAGGT GGGGCGCAGG AGAGCCAGGG 6720 

GAAAAGTGCC TCATAGTTAG GAAAGTTTAG GCACTAGTTG GGAGTAATAT TCAATTAATT 6780 

AGACTTTTGT ATAAGAGATG TCATGCCTCA AGAAAGCCAT AAAOCTGGTA GGAACAGGTC 6840 

CCAAGCGGTT GAGCCTGGCA GAGTACCATG CGCTCGGCCC CAGCTGCAGG AAACAGCAGG 6900 

CCCCGCCCTC TCACAGAGGA TGGGTGAGGA GGCCAGACCI GCCCTGCCCC ATTGTCCAGA 6960 

TGGGCACTGC TGTGGAGTCT GCTTCTCCCA TGTACCAGGG CACCAGGCCC ACCCAACTGA 7020 

AGGCATGGCG GCGGGQTGCA GGGGAAAGTT AAAGGTGATG ACGATCATCA C ACCTCG TGT 7080 

CGTTACCTCA GCCAICGGTC TAGCAIATCA GTCACTGGGC OCAACATATC CATTTTTAAA 7140 
CCCTTTCCCC CAAATACACT GCGTCCTGGT TCCTGTTTAG C TGTT CT G AA ATA 

SEQ ID NO:288 PFD2 Protein sequence: 
Protein Accession #: A38198 

1 11 21 31 41 51 

I I I I I I 

HMMMMMMKKM QHQRQQQADH ANEANYARGT KLPLSGEGPT SQPNSSKQTV LSWQAAIDAA 60 

RQAKAAQTMS TSAPPFVGSL SQRKRQQYAK SKKQGNSSNS HFARALFCLS LHNPIRRACI 120 

SIVEWKPFDI FILLAIFANC VALA1Y1PFP EDDSNSTNHN LEKVEYAPLI IFTVETFLKI 180 

IAYGLLLHPN AYVRNGWNLL DFVIV1VGLP SVILEQLTKE TEGGNHSSGK SGGFDVKALR 240 

APRVLRPLRL VSGVPSLQW LNSIIKAMVP LLHIALLVLP VIIIYAIIGL ELFIGKMHKT 300 

CFFADSDIVA EEDPAPCAPS GNGRQCTANG TECRSGWVGP NGGITNFDNF APAMLTVPQC 360 

ITMEGWTDVL YWVNDAIGWE WPWVYFVSLI ILGSFFVLNL VLGVLSGEFS KEREKAKARG 420 

DFQKLREKOQ LEEDLKGYLD WITQAEDIDP ENEEEGGEEG KEMTSMPTSE TESVNTENVS 480 

GEGENRGCCG SLWCWWRRRG AAKAGPSGCR RWGQAISKSK LSRRWRRWNR FHRRRCEAAV 540 

KSVTFYWLVI VLVPLNTLTI SSEHTOQPDW LTQIQDIAKK VLLALPTCEH LVKMYSLGLQ 600 

AYFVSLFBHF DCFWCGGIT ETILVELEIM SPLGISVPRC VRLLRIPKVT BHWTSLSNLV 660 

ASLLNSMKSI ASLLLLLFLF IIIFSLLGHQ LFGGKFNFDE TQTKRSTFDM FPQALLTVPQ 720 

ILTGEDWNAV KYDGIMAYGG PSSSGMIVCI YPIILPICGN YILLNVFLAI AVDNLADAES 780 

LNTAQKEEAE EKERKKIARK ESLENKKKNK PBVKQIANSD HKVTIDDTOB EDEDKDPYPP 840 

CDVPVGEEBE EEEEDEPEVP AGPRPFRISE IHMKEKIAPI PEGSAFFILS KTNPIRVGCH 900 

KLINHHIFTN LILVFIMLSS AALAAEDP1R SHSFRNT1LG YFDYAFTAIF TVEILLKMTT 960 

FGAFLHKGAF CKNYPNLLDK LWGVSLVSF GIQS5AISW KILRVLRVLR PLRAINRAKG 1020 

LKHWQCVFV AXRTIGN2HX VTTLLQFHFA CIGVQLFKGK FVRCTDEAKS NPEECRGLPI 1080 

LYKDGDVDSP WRERIWQNS DFNFKWLSA HMRLFTVSTF EGWPALLYKA IDSNGEH1GP 1140 

IYNHRVEISI FFIIYIIIVA FFMMNIFVGF VIVTFQEQGE KBYKNCELDK NQRQCVEYAL 1200 

KARPLRRYXP KNPYQVKFWY WNSSPFEKM KFVLQSLNTL CLAKOHYEQS KHFNDAMDIL 1260 

HMVPTGVFTV EHVLKVIAPK PKGYFSDAMN TPDSLZVIG5 IIDVALSEAD PTESENVPVP 1320 
TATPGNSEES NRISITFFRI4 FRVMRLVKLL SRCEOIRTLIi WTFIKSFQAL PYVALLIAHL 1380 

FFIYAVI6MQ HPGKVAHRBN NQDJRNNNFQ TFPQAVLLLF RCATGEAWQB IHLACLPGKL 1440 

CDPESDYNPG EEYTCGSNFA IWPISFOTL CAFLIINLFV AVIHDNFDYL TRDWSII/3PH 1500 

BLDEFKRIWS ETOPEAKGRI KHLDWTLLR RIQPPLGFGK LCPHKVACKR LVAKNMPLNS 1560 

DGTVHFNATL PALVRTALKT KTBGNLEQAN EELRAVIKKI WKKTSMKLLD QWPPAGDDE 1620 

VTVGKFYATF LIQDYFRKFK KRKEQGLV6K YPAKNTTIAL QAOLRTLBDI GPEIRRAISC 1680 

DLQDDBPEET KREEBDDVPK RNGALLGNHV NHVNSDRHDS LQQTHTTBHP LHVQRPSIPP 1740 

ASDTEKPLFP PAGNSVCHNH HNBNSIGKQV PTSTOANLNN ANMSKAAHGK RPSIGKLEHV 1800 

SENGHHSSHK HDRBPQRRSS VKRTRYYETY IRSDSGDEQL PTICREDPBI HGYFRDPHCL 1860 

GBQEYPSSEE CTEDDSSPTW SRQNVGYYSR YPGHNIDSER PRGYHHPQGF LEDDDSPVCY 1920 

DSRRSPKRRL LPPTPASHRR SSFKFECLRR QSSQEEVPSS PIFPHRTALP LHLMQQQIMA 1980 

VAGLDSSKAQ KYSPSHSTRS WATPPATPPY RDWTPCYTPL IQVEQSEALD QVNGSLPSLH 2040 

RSSWYTDEPD ISYRTPTPAS LTVPSSFHNK NSDXQRSADS LVEAVLISEQ LGRYARDPKF 2100 

VSATKHEIAD ACDLTIDEME SAASTLLNGN VRPRANGDVG PLSHRQDYEL QDFGPGYSDE 2160 
EPDPGBDEED LADSMZCITT L 

SEQ ID NO:289 cm DNA SEQUENCE 

Nucleic Acid Accession #: NM_002812 

Coding sequence: 150-3382 (underlined sequence corresponds to start and stop codon) 



1 11 21 31 41 51 

jLcTCCCGOC TCGGGACGCC TCGGGGTCGG GCTCCGGCTG CGGCTGCTQC TGCGGCGCCC 60 

OCGCTCCGGT GCGTCCGCCT CCTGTGCCCG CCGCGGAGCA GTCTGCGGGC CGCCGTGCGC 120 

CCTCAGCTCC TTTTCCTGAG CCCGCCGCGA TGG GAGCTGC GCGGGQATCC CCGGCCAOAC 180 

CCOGCCGGTT GCCTCTGCTC AGCGTCCTGC TCCTGCOOCT GCTGGGCGGT ACCCAGACAG 240 

CCATTGTCTT CATCAAGCAG CXMTCCTCCC AGGATGCACT GCAGGGGCGC CGGGCGCTGC 300 

TTCGCTGTGA GGTTGAGGCT CCGGGCCCGG TACATGTGTA CTGGCTGCTC GATGGGGCCC 360 

CTGTCCAGGA CACGGAGCGG CGTTTCGCCC AGGGCAGCAG CCTGAGCTTT GCAGCTGTGG 420 

ACCGGCTGCA GGACTCTGGC ACCTTCCAGT GTGTGGCTCG GGATGATGTC ACTGGAGAAG 480 

AAGCCCGCAG TGCCAAOGCC TCCTTCAACA TCAAATGGAT TGAGGCAGGT CCTGTGGTCC S40 

TGAAGCATCC AGCCTCGGAA GCTGAGATCC AGCCACAGAC CCAGGTCACA CTTCGTTGCC 600 

ACATTGATGG GCACCCTCGG CCCACCTACC AATGGTTCCG AGATGGGACC CCCCTTTCTG 660 

ATGGTCAGAG CAACCACACA GTCAGCAGCA AGGAGCGGAA CCTGACGCTC CGGCCAGCTG 720 

GTCCTGAGCA TAGTGGGCTQ TATTCCTGCT GCGCCCACAG TGCTTTTGGC CAGGCTTGCA 780 

GCAGCCAGAA CTTCACCTTG AGCATTGCTG ATGAAAGCTT TGCCAGGGTG GTGCTGGCAC 840 

CCCAGGACGT GGTAGTAGCG AGGTATGAGG AGGCCATGTT CCATTGCCAG TTCTCAGCCC 900 

AGCCACCCCC GAGCCTGCAG TGGCTCTTTG AGGATGAGAC TCCCATCACT AACCGCAGTC 960 

GCCCCCCACA CCTCCGCAGA GCCACAGTGT TTGCCAACGG GTCTCTGCTG CTGAOCCAGG 1020 

TCCGGCCACG CAATGCAGGG ATCTACCGCT GCATTGGCCA GGGGCAGAGG GGCCCACCCA 1080 

TCATCCTGGA AGCCACACTT CACCTAGCAG AGATTGAAGA CATGCCGCTA TTTGAGCCAC 1140 

GGGTGTTTAC AGCTGGCAGC GAGGAGCGTG TGACCTGCCT TCCCCCCAAG GGTCTGCCAG 1200 

AGCCCAGCGT GTGGTGG8AG CACGCGGGAG TCCGGCTGCC CACCCATGGC AGGGTCTACC 1260 

AGAAGGGCCA CGAGCTGGTG TTGGCCAATA TTGCTGAAAG TGATGCTGGT GTCTACACCT 1320 

GCCACGCGGC CAACCTGGCT GGTCAGCGGA GACAGGATGT CAACATCACT GTGGCCACTG 1380 

TGCCCTCCTG GCTGAAGAAG CCCCAAGACA GCCAGCTGGA GGAGGGCAAA CCGGGCZACT 1440 

TCGATTGCCT GACCCAGGCC ACACCAAAAC CTACAGTTGT CTGGTACAGA AACCAGATGC 1500 

TCATCTCAGA GGACTCACGG TTCGAGGTCT TCAAGAATGG GACCTTGCGC ATCAACAGCG 1560 

TGGAGGTGTA TGATGGGACA TGGTACCGTT GTATGAGCAG CACCCCAGCC GGCAGCATCG 1620 

AGGCGCAAGC CCGTGTCCAA GTGCTGGAAA AGCTCAAGTT CACACCACCA CCCCAGCCAC 1680 

AGCAGTGCAT GGAGTTTGAC AAGGAGGCCA CGGTGCCCTG TTCAGCCACA GGCCGAQAGA 1740 

AGCCCACIAT TAAGTGGGAA CGGGCAGATG GGAGCAGCCT CCCAGAGTGG GTGACAGACA 1800 

ACGCTGGGAC CCTGCATTTT GCCCGGGTGA CTCGAGAIGA CGCTGGCAAC TACACTTGCA 1860 

TTGCCTCCAA CGGGCCGCAG GGCCAGATTC GTGCCCATCT CCAGCTCACT GTGGCAGTTT 1920 

TTATCACCTT CAAAGTGGAA CCAGAGCGTA CGACTGTGTA CCAGGGCCAC ACAGCCCTAC 1980 

TGCAGTGCGA GGCCCAGGGG GACCCCAAGC CGCTGATTCA GTGGAAAGGC AAGGACCGCA 2040 

TCCTGGACCC CACCAAGCTG GGAGOCAGGA TGCACATCTT CCAGAATGGC TCOCTGGTGA 2100 

TCCATGACGT GGCCCCTGAG GACTCAGGCC GCTACACCTG CATTGCAGGC AACAGCTGCA 2160 

ACATCAAGCA CACGGAGGCC CCCCTCTATG TCGTGGACAA GCCTGTGCCG GAGGAGTCGG 2220 

AGGGOCCTGG CAGCCCTCCC CCCTACAAGA TGATCCAGAC CATTGGGTTG TCGGTGGGTG 2280 

CCGCTGTGGC CTACATCATT GCCGTGCTGS GCCTCATGTT CTACTGCAAG AAGCGCTGCA 2340 

AAGCCAAGCG GCTGCAGAAG CAGCCCGAGG GCGAGGAGCC AGAGATGGAA TGCCTCAACC 2400 

GAGGGCCTTT GCAGAACGGG CAGCCCTCAG CAGAGATCCA AGAAGAAGTG GCCTTGACCA 2460 

GCTTGGGCTC CGGCCCCGCG GCCACCAACA AACGCCACAG CACAAGTGAT AAGATGCACT 2520 

TCCCACGGTC TAGCCTGCAG CCCATCACCA CGCTGGGGAA GAGTGAGTTT GGGGAGGTGT 2580 

TCCTCGCAAA GGCTCAGGGC TTGGAGGAGG GAGTGGCAGA GACCCTGGTA CTTGTGAAGA 2640 

GCCTGCAGAC GAAGGATGAG CAGCAGCAGC TGGACTTCCG GAGGGAGTTO GAGATGTTTG 2700 

GGAAGCTGAA CCACGCCAAC GTGGTGCGGC TCCTGGGGCT GTGCCGGGAG GCTGAGCCCC 2760 

ACTACATGQT GCTGGAATAT GTGGATCTGG GAGACCTCAA GCAGTTCCTG AGGATTTCCA 2820 

AGAGCAAGGA TGAAAAATTG AAGTCACAGC CCCTCAGCAC CAAGCAGAAG GTGGCCCTAT 2880 

GCACCCAGGT AGCCCTGGGC ATGGAGCACC TGTCCAACAA CCGCTTTGTG CATAAGGACT 2940 

TGGCTGCGCG TAACTSCCTG GTCAGTGCCC AGAGACAAGT GAAGGTGTCT GCOCTGGGCC 3000 

TCAGCAAGGA TGTGTACAAC AGTGAGTACT ACCACTTCCG CCAGGCCTGG GTGCCGCTGC 3060 

GCTGGATGTC CCCCGAGGCC ATCCTGGAGG GTGACTTCTC TACCAAGTCT GATGTCTGGG 3120 

CCTTCGGTGT GCTGATGTGG GAAGTCTTTA CACATGGAGA GATGCCCCAT GGTGGGCAGG 3180 

CAGATGATGA AGTACTGGCA GATTTGCAGG CTGGGAAGGC TAGACTTCCT CAGCCCGAGG 3240 

GCTGCCCTTC CAAACTCTAT CGGCTGATGC AGCGCTGCTG GGCCCTCAGC CCCAAGGACC 3300 

GGCCCTCCTT CAGTGAGATT GCCAGCGCCC TGGGAGACAG CACCGTGGAC AGCAAGCCGT 3360 

GAGGAGGGAG CCCGCTCAGG ATGGCCTGGG CAGGGGAGGA CATCTCTAGA GGGAAGCTCA 3420 
CAOCA.TGA.TG GGCAAGATCC CTGTCCTCCT GGGCCCTGAG GTGCCCT A CT GCAACAGGCA 3480 

TTGCTGAGGT CTGAGCAGGG CCTGGCCTTT CCTCCTCTTC CTCACCCTCA TCCTTTGGGA 3540 

GGCTGACTTG GACCCAAACT GGGCGACTAG GGCTTTGAGC TGGGCAGTTT CCCCTGCCAC 3600 

UTCTTCC T gl' ATCAGGGACA GTGTGSGTGC CACAGGTAAC CCCAATTTCT GGCCTTCAAC 3660 

TTCTCCCCTT GACCGGGTCC AACTCTOCCA CTCATCTGCC AACTTTGCCT GGGQAGGGCT 3720 

AGGCTTGGQA TGAGCTGGST TTGTGGGGAQ TTOCTTAATA TTCTCAAGTT CTGGGCACAC 3780 

AGGGTTAATG AGTCTCTTGC CCACTSGT CC ACTTGGGGGT CTAGACCAGG ATTATAGAGG 3840 

ACACAGCAAG TGAGTCCTCC CCACTCTGGG CTTGTGCACA CTGACCCAGA CCCACOTCTT 3900 

CCCCACCCTT CTCTCCrTTC CTCATCCTAA GTGOCTGGCA GATGAAGQAG TTTTCAGGAG 3960 

CTTTTGACAC TATATAAACC CCC CTTTTTQ TATGCACCAC GGGCGGCTTT TATATGTAAT 4020 

TGCAGCGTGG GGTGGGTGGG CATGGSAGGT AGGGGTGGGC CCTGGAGATG AGGAGGGTGG 4080 

GCCATCCTTA CCCCACACTT TTATTGTTGT CGTTTTTTGT TTGTTTTGTT TTTTTGTTTT 4140 
WnWWT TTTACACTCG CTGCTCTCAA TAAATAAGCC TTTTTTA 



SEQ ID NO:290 0BI6 Protein sequence: 
Protein Accession #: NP_002812 

1 11 21 31 41 51 

I I I I I I 

MQAARGSPAR PRRLPLLSVL LLPLLGGTQT AIWIKQPSS QDALQGRRAL LRCEVEAPGP 60 

VHVYWIiLDGA PVQDTEEEPA QGSSLSFAAV ORLQDSGTFQ CVARDDVTGE EARSANASFN 120 

IKWIEAGPW LKHPASEAEI QPQTQVTLRC HIDGHPRPTY QWFRDGTPLS DGQSNHTVSS 180 

KERNLTLRPA GPEKSGLYSC CAHSAFGQAC SSQHFTLSIA DESFAHWLA PQDWVAKYB 240 

EAMFHCQPSA CPPPSLQWLP EDETPITNRS RPPHLRRATV FANGSLLLTQ VRPRNAGIYR 300 

CIGQGQRGPP IILEATLHLA EIEDMPLFBP RVFTAGSEER VTCLPPKGLP EPSVMWEHAG 360 

VRLPTHGRVY fiKGHELVLAH IAESDAGVYT CHAANLAGQR RQDVNITVAT VPSWLKKPQD 420 

SQLEEGKPGY LDCLTQATPK PTWWYRNQM LISEDSRFBV PKHGTLRINS VEVYDGTWYR 480 

CMSSTPAGSI EAQARVQVLE KLKFTPPPQP QQCMEPDKEA TVPCSATGRE KPTIKWERAD 540 

GSSLPEKVTD HAGTLHFARV TRDDAGNYTC IASNGPQGQI EAHVQLTVAV FITFKVEPER 600 

TTVYQGHTAL LQCEAQGDPK PLIQWKGRDR ILEPTKLGPR HHIPQNGSLV IHDVAPEDSG 660 

RYTCIAGNSC NIKHTEAPLY WDKFVPEES EGPGSPPPYK MIQTIGLSVG AAVAYIIAVL 720 

GLMFYCKKRC KAKRLQKQPE GEEPEMBOJI GGPLQNGQPS AEIQEEVALT SLGSGPAATN 780 

KRHSTSDKMH FPRSSLQPIT TLGKSEPGE7 FLAKAQGLEE GVAETLVLVK SLQTKDEQQQ 840 

LDPRRELEHP GKLNHANWR LLGLCREAEP HVMVLBYVDL GDLKQFLRIS KSKDEKLKSQ 900 

PLSTKQKVAL CTQVALGMEH LSKNRFVHKD LAARNCLVSA QRQVKVSALG LSKDVYNSEY 960 

YHFRQAWVPL EWKSPEAILE GDFSTKSDVW AFGVLHWEVP THGEMPHGGQ ADDBVLADLQ 1020 
AGKARLPQPE GCPSKLYRLM QRCWALSPKD RPSFSEIASA LGDSTVDSKP 



SEQ ID NO:291 AAB1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002205 

Coding sequence: 1-3150 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

ATGGGGAGCC GGACGCCAGA GTCCCCTCTC CACGCCGTGC AGCTGCGCTC GGGCCOCCGG 60 

CGCCGACCCC CGCTSSTGCC GCTGCTGTTG CTGCTSSTGC CGCCGCCACC CAGGGTCGGG 120 

GGCTTCAACT TAGACGCGGA GGCCCCAGCA GTACTCTCGG GGCCCCCGGG CTCCTTCTTC 180 

GGATTCTCAG TGGAGTTTTA CCGGCCGGGA ACAGACGGGG TCAGTGTGCT GGTGGGAGCA 240 

CCCAAGGCTA ATACCAGCCA GCCAGGAGTG CTGCAGGGTG GTGCTGTCTA CCTCTGTCCT 300 

TGGGGTGCCA GCCCCACACA GTGCACCCCC ATTGAATTTG ACAGCAAAGO CTCTCGGCTC 360 

CTGGAGTCCT CACTGTCCAG CTCAGAGGGA GAGGAGCCTG TGGAGTACAA GTCCTTGCAG 420 

TGGTICGGGG CAACAGTTCG AGCCCATGGC TCCTCCATCT TGGCATGCGC TCCACTGTAC 480 

AGCTGGCGCA CAGAGAAGGA GCCACTGAGC GAOCCCGTGS GCACCTGCTA CCTCTCCACA 540 

GATAACTTCA CCCGAATTCT GGAGTATGCA CCCTGCCGCT CAGATTTCAG CTGGGCAGCA 600 

GGACAGGGTT ACTGCCAAGG AGGCTTCAGT GCCGAGTTCA CCAAGACTGG CCGTGTGGTT 660 

TTAGGTGGAC CAGGAAGCTA TTTCTGGCAA GGCCAGATCC TGTCTGCCAC TCAGGAGCAG 720 

ATTGCAGAAT CTTATTACCC CGAGTACCTG ATCAACCTGG TTCAGGGGCA GCTGCAGACT 780 

CGCCAGGCCA GTTCCATCTA TGATGACAGC TACCTASGAT ACTCTGTGGC TGTTGGTGA& 840 

TTCAGTGGTG ATGACACAGA AGACTTTGTT GCTGGTGTGC CCAAAGGGAA CCTCACTTAC 900 

GGCTATGTCA CCATCCTTAA TGGCTCAGAC ATTCGATCCC TCTACAACTT CTCAGSGGAA 960 

CAGATGGCCT CCTACTTTGG CTATGCAGTG GCCGCCACAG ACGTCAATGG GGACGGGCTG 1020 

GATGACTTGC TGGTGGGGGC ACCCCTGCTC ATGGATCGGA CCCCTGACGG GCGGCCTCAG 1080 

GAGGTGGGCA GGGTCTACGT CTACCTGCAG CACCCAGCCG GCATAGAGCC CACGCCCACC 1140 

CTTACCCTCA CTGGCCATGA TGAGTTTGGC CGATTTGGCA GCTCCTTGAC CCCCCTGGGG 1200 

GACCTGGACC AGGATGGCTA CAATGATSTG GCCATCGGGG CTCCCTTTGG TGGGGAGACC 1260 

CAGCAGGGAG TAGTGTTTGT ATTTCCTGGG GGCCCAGGAG GGCTGGGCTC TAAGCCTTCC 1320 

CAGGTTCTGC AGCCCCTGTG GGCAGCCAGC CACACCCCAG ACTTCTTTCG CTCTGCCCTT 1380 

CGAGGAGGCC GAGACCTGGA TGGCAATGGA TATCCTQATC TGATTGTGGG GTCCTTTGGT 1440 

GTGGACAAGG CTGTGGTATA CAGGGGCCGC CCCATCGTGT CCGCTAGTGC CTCCCTCACC 1500 

ATCTTCCCCQ CCATGTTCAA CCCAGAGGAG CGGAGCTGCA GCTTAGAGGG GAACCCTGTG 1560 

GCCTGCATCA ACCTTAGCTT CTGCCTCAAT GCTTCTGGAA AACACGTTGC TGACTCCATT 1620 

GGTTTCACAG TGGAACTTCA GCTGGACTGG CAGAAGCAGA AGGGAGGGGT ACGGCGGGCA 1680 

CTGTTCCTGG CCTCCAGGCA GGCAACCCTG ACCCAGACCC TGCTCATCCA GAATGGGGCT 1740 

CGAGAGGATT GCAGAGAGAT GAAGATCTAC CTCAGGAACG AGTCAGAATT TCGAGACAAA 1800 

CTCTCGCCGA TTCACATCGC TCTCAACTTC TCCTTGGACC CCCAAGCCCC AGTGGACAGC 1860 

CACGGCCTCA CGCCAGCCCT ACATTATCAG AGCAAGAGCC GGATAGAGGA CAAGGCTCAG 1920 

ATCTTGCTGG ACTGTGGAGA AGACAACATC TGTGTGCCTG ACCTGCAGCT GGAAGTGTTT 1980 
GGGGAGCAGA ACCATGTGTA CCTGGGTGAC AAGAATGCCC TGAACCTCAC TTTCCATGOC 2040 

CAGAATGTGG GTGAGGGTGO CGCCTATOAG GCTGAGCTTC GCGTCACCGC CCCTCCAGAG 2100 

GCTGAGTACT CAGGACTCGT CAGACACCCA GGGAACTTCT CCAGCCTGAG CTGTGACTAC 2160 

TTTGCCOTGA ACCAGAGCCO CCTGCTGOTG TGTGAOCTGG GCAACCCCAT GAAGGCAGGA 2220 

GCCAGTCTGT GGGGTGGCCT TCGGTTTACA GTCCCTCATC TCCGSGACAC TAAGAAAACC 2280 

ATCCAGTTTG ACTTCCAGAT CCTCAGCAAG AATCTCAACA ACTCGCAAAO CGACQTGGTT 2340 

TCCTTTCGGC TCTCCGTGGA GGCTCAGGOC CAGGTCACCC TGAACGGTGT CTCCAAGCCT 2400 

GAGGCAGTGC TATTCCCAOT AAGCGACTG6 CATCCCCCAG ACCAGCCTCA GAAGGAGGAG 2460 

GACCTGGGAC CTGCTGTCCA CCATOTCTAT GAGCTCATCA ACCAAG6CCC CAGCTCCATT 2520 

AGOCACGGTG TGCTGGAACT CAGCTSTCCC CAGGCTCTGG AAGGTCAGCA GCTCCTATAT 2580 

GTGACCAGAG TTACGQGACT CAACTCCACC ACCAATCACC CCATTAACCC AAA0X3GCCTG 2640 

GAGTTGGATC 0CGA6GGTTC CCTGCACCAC CAGCAAAAAC GGGAAGCTOC AAGCCSCAGC 2700 

TCTGCTTCCT 0G6GACCTCA GATCCTGAAA TG00C66A6G CTGAGTGTTT CAG6CT6CGC 2760 

TGTGAGCTCQ GGCCCCTGCA CCAACAAGAG ACCCAAAGTC TGCAGTTGCA TTTCCGAGTC 2820 

T6GGCCAAGA CTTTCTTOCA GCGGGAGCAC CAGOCATTTA GCCTGCAGTG TGAGGCTGTG 2880 

XACAAAOCOC TGAAGATGCC CTACCGAATC CTGCCTCGGC AGCTOCCCCA AAAAGACCGT 2940 

CAGGTGGOCA CAGCTGTGCA ATGGACCAAG GCAGAAGGCA GCTATGGCGT CCCACTGTGG 3000 

ATCATCATCC TAGCCATCCT GTTTGGCCTC CTGCTCCTAG GTCTACTCAT CTACATCCTC 3060 

TACAAGCTTG GATTCTTCAA ACGCTCCCTC CCATATGGCA CCGCCATCGA AAAAGCTCM3 3120 
CTCAAGGCTC CAGCCACCTC TCATGCCTGA 



SEQ ID NO:292 AAB1 Protein sequence: 
Protein Accession #: NP_002196 

1 11 21 31 41 51 

I I I I I I 

MGSRTPESPL HAVQLRWGPR RRPPLLPLLI* LLLPPPPF.VG GPHUJAEAPA VLSGPPGSFP 
GFSVEFYBPG TDGVSVLVGA PKANTSQPGV LQGGAVYLCP WGASPTQCTP IEFDSKGSRL 
LBSSLSSSES EEPVEYKSLQ WTOATVHAHG SSILACAPLY SWRTEKBPLS DPVGTCYLST 
DHFTRXXtBYA PCBSDFSHAA GQGYCQGGFS AEFTKTGRW LGGPGSYFWQ GQILSATQEQ 
IAESYYPSYL HJLVQGQLQT RQASSIYDDS YLGYSVAVGE FSGDDTEDFV AGVPKGNLTY 
GYVTILNQSD IRSLWPSGE QUASYFGYAV AATDVNGDGL DDLLVGAPLL HDRTPDGRPQ 
EVGRVYVYLQ HPAGIEPTPT LTLTGHDEFG RFGSSLTPLG DLDQDGYNDV AIGAPFGGET 
QQGWPVFPG GPGGLGSKPS QVLQPWAAS HTPDFFGSAL RGGRDLDGNG VPDLIVGSFG 
VDKAWYRGR PIVSASASLT IFPAHFNPB8 RSCSLBGNPV ACINLSPCLN ASGKHVADSI 
GFTVELQLDW QKQKGGVRRA LFLASRQATL TQ1XLIQNGA REDCREHKIY LHNESBFRDK 
I^SPIHIAIJJF SLDPQAPVDS HGLRPALHYQ SKSRIEDKAQ ILLDCGBDNI CVPDLQLEVP 
GBQNHVY16D KHALNLTFHA QNVGEGGAYE AEUCVTAPPE AHYSGLVRHP GKFSSLSCDY 
PAVNQSRLLV CDLGNPHKAG ASLWGGLRPT VPHLHDTKKT XOFSFQILSX KLNNSQSDW 
SFRLSVEAQA QVTLNGVSKP EAVLFPVSEFW HPRDQPQKEE DLGPAVHHVY ELXNQGFSSI 
SQGVLELSCP QALEGQQLLY VTKVTGLNCT TNHPINPKGL ELDPEGSLBH CQKREAPSRS 
SASSGPQILK CPBAGCFRLR CELGPLHQQE SQSLQLHFRV WAKTFLQREH QPFSLQCRAV 
YKALKKPYRI LPRQLPQKER QVATAVQWTK AEGSYGVPLW IIILAILFGL LLLGLLIYIL 
YKLGPPKRSL PYGTAMEKAQ LKPPATSDA 



SEQ ID NO:293 IBH4 DNA SEQUENCE 

Nucleic Acid Accession #: BC001291 

Coding sequence: 44-541 (start and stop codons are underlined) 



1 11 21 31 41 51 

I I I I I I 

GGGGGOGCOG CGCGCTGACC CTCCCTGGGC ACCGCTGGGG ACG^IQGCGC TGCTCGCCTT 60 
GC1GCIOGIC GTGGCCCTAC CGCGGGTCTG GACAOACGCC AAOCTGACTG CGAGACAACG 120 
AGATCCAGAG GACTCCCAGC OAACGGAOGA GGGTGACAAT AGAGTGTGGT GTCATGTTTG 180 
TOAGAGAGAA AACACTTTCG AGTOCCAGAA CCCAAOOAGG TGCAAATGGA CAQAGCCATA 240 
CTGCGTTATA GCGGCCGTGA AAATATTTOC ACGTTTTTTC ATGGTTGCGA AGCAGTGCTC 300 
CGCTGGTTGT GCAGCGATGG AGAGACCCAA GCCAGAGQAO AAGCGGTTTC TCCTGGAAGA 360 
GCCCATGCCC TTCTTTTACC TCAAGTGTTG TAAAATTCGC TACTGCAATT TAGAGGGGCC 420 
ACCTATCAAC TCATCAGTGT TCAAAGAATA TGCTGGGAGC ATGGGTG AGA GCTQTGGTGG 480 
GCTGTGGCTG GCCATCCTCC TGCTGCTGGC CTCCATTGCA GCCGGCCTCA GCCTGTCT12 540 
AGCCACGGGA CTGCCACAGA CTGAGCCTTC CGGAGCATGQ ACTCGCTCCA GACCGTTGTC 600 
ACCTGTTGCA TTAAA C I 1 G1 1 1 1CTGTTGA TTACCTCTTG GTTTG ACTTC CCAGGGTCTT 660 
GGGATGGG AG AGTGGGGATC AGGTGCAGTT GGCTCTTAAC CCTCAAGGGT TCTTTAACTC 720 
ACATTCAGAG GAAGTCCAGA TCTCCTGAGT AGTGATTTTG GTGACAAGTT nTCTCTTTG 780 
AAATCA AACC TTGTAACTCA TTTATTGCTG ATGGCCACTC TTTTCCTTGA CTCCCCTCTG 840 
OCTCTG AGGG CTTCAGTATT GATGGGGAGG GAGGCCTAAG TAOCACTCAT GGAGAGTATG 900 
TGCTGAG ATG CTTCCGACCT TTCAGGTGAC GCAGGAACAC TGGGGG AGTC TO AATGATTO 960 
GGGTOAAGAC ATCCCTCGAG TGAAGGACTC CTCAGCATGO GGGGCAGTGG GGCACACGTT 1020 
AGGGCTGCCC CCATTCCAGT GOTGGAGGCG CTGTGGATGG C10C 1 1 UC C TCAACCTTTC 1080 
CTACCAGATT OCAGGAGGCA GAAGATAACT AATTGTGTTG AAGAAACTTA GACTTCACCC 1140 
ACCAGGTGGC ACAGGTCCAC AGATTCATAA ATTCCCACAC GTGTGTGTTC AACATCTGAA 1200 
ACTTAGGCCA AGTAG AGAGC ATCAGGGTAA ATGGCGTTCA TTTCTCTGTT AAGATGCAGC 1260 
CATCCATGGG GAGCTC AGAA ATCAGACTCA AAGTTCCACC AAAAACAAAT ACAAGGGGAC 1320 
TTCAAAAGTTCACGAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAA 






SEQ ID NO:294 LBH4 Protein sequence: 
Protein Accession #: AAH01291 



1 11 21 31 41 51 

MALIALLLVV ALPRVWTDAN LTARQRDPBD SQRTDEGDNR VWCHVCEREN TFECQNPRRC 60 
KWTEPYCVIA AVKIFPRFFM VAJCQCSAGCA AMERPKPEEK RFLLEEPMPP FYLKCCKIRY 120 
CNLEOPPINS SVFKEYAGSM GESCGGLWLA ILLLLASIAA GLSLS 
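
As a consistency check on the listings above, the length of each protein sequence can be compared with the coding-sequence coordinates given for the corresponding DNA sequence. The following Python sketch is illustrative only and forms no part of the disclosed methods. The function name `protein_length` is hypothetical, and the sketch assumes that the coordinates are 1-based and inclusive and that each coding range ends with a stop codon.

```python
def protein_length(cds_start: int, cds_end: int) -> int:
    """Number of residues encoded by a 1-based, inclusive coding range.

    Assumes the range ends with a stop codon, which encodes no residue.
    """
    n_nt = cds_end - cds_start + 1
    if n_nt <= 0 or n_nt % 3 != 0:
        raise ValueError("coding range must be a positive multiple of 3 nucleotides")
    return n_nt // 3 - 1  # drop the terminal stop codon


# Coding ranges quoted in the listings above.
print(protein_length(1, 3150))   # range given as 1-3150
print(protein_length(44, 541))   # range given as 44-541
```

Under these assumptions the two ranges correspond to 1049 and 165 residues respectively, which agrees with the number of residues printed in the corresponding protein listings.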



It is understood that the examples described above in no way serve to limit the 
true scope of this invention, but rather are presented for illustrative purposes. All 
publications, sequences of accession numbers, and patent applications cited in this 
specification are herein incorporated by reference as if each individual publication or patent 
application were specifically and individually indicated to be incorporated by reference. 
WHAT IS CLAIMED IS: 

1. A method of detecting a prostate cancer-associated transcript in a cell from a patient, the method comprising contacting a biological sample from the patient with a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16. 

2. The method of claim 1, wherein the polynucleotide selectively hybridizes to a sequence at least 95% identical to a sequence as shown in Tables 1-16. 

3. The method of claim 1, wherein the biological sample is a tissue sample. 

4. The method of claim 1, wherein the biological sample comprises isolated nucleic acids. 

5. The method of claim 4, wherein the nucleic acids are mRNA. 

6. The method of claim 4, further comprising the step of amplifying nucleic acids before the step of contacting the biological sample with the polynucleotide. 

7. The method of claim 1, wherein the polynucleotide comprises a sequence as shown in Tables 1-16. 

8. The method of claim 1, wherein the polynucleotide is labeled. 

9. The method of claim 8, wherein the label is a fluorescent label. 

10. The method of claim 1, wherein the polynucleotide is immobilized on a solid surface. 

11. The method of claim 1, wherein the patient is undergoing a therapeutic regimen to treat prostate cancer. 

12. The method of claim 1, wherein the patient is suspected of having prostate cancer. 

13. A method of monitoring the efficacy of a therapeutic treatment of prostate cancer, the method comprising the steps of: 
(i) providing a biological sample from a patient undergoing the therapeutic treatment; and 
(ii) determining the level of a prostate cancer-associated transcript in the biological sample by contacting the biological sample with a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16, thereby monitoring the efficacy of the therapy. 

14. The method of claim 13, further comprising the step of: (iii) comparing the level of the prostate cancer-associated transcript to a level of the prostate cancer-associated transcript in a biological sample from the patient prior to, or earlier in, the therapeutic treatment. 

15. The method of claim 13, wherein the patient is a human. 

16. A method of monitoring the efficacy of a therapeutic treatment of prostate cancer, the method comprising the steps of: 
(i) providing a biological sample from a patient undergoing the therapeutic treatment; and 
(ii) determining the level of a prostate cancer-associated antibody in the biological sample by contacting the biological sample with a polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16, wherein the polypeptide specifically binds to the prostate cancer-associated antibody, thereby monitoring the efficacy of the therapy. 

17. The method of claim 16, further comprising the step of: (iii) comparing the level of the prostate cancer-associated antibody to a level of the prostate cancer-associated antibody in a biological sample from the patient prior to, or earlier in, the therapeutic treatment. 

18. The method of claim 16, wherein the patient is a human. 

19. A method of monitoring the efficacy of a therapeutic treatment of prostate cancer, the method comprising the steps of: 
(i) providing a biological sample from a patient undergoing the therapeutic treatment; and 
(ii) determining the level of a prostate cancer-associated polypeptide in the biological sample by contacting the biological sample with an antibody, wherein the antibody specifically binds to a polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16, thereby monitoring the efficacy of the therapy. 

20. The method of claim 19, further comprising the step of: (iii) comparing the level of the prostate cancer-associated polypeptide to a level of the prostate cancer-associated polypeptide in a biological sample from the patient prior to, or earlier in, the therapeutic treatment. 

21. The method of claim 19, wherein the patient is a human. 

22. An isolated nucleic acid molecule consisting of a polynucleotide sequence as shown in Tables 1-16. 

23. The nucleic acid molecule of claim 22, which is labeled. 

24. The nucleic acid of claim 23, wherein the label is a fluorescent label. 

25. An expression vector comprising the nucleic acid of claim 22. 

26. A host cell comprising the expression vector of claim 25. 

27. An isolated polypeptide which is encoded by a nucleic acid molecule having polynucleotide sequence as shown in Tables 1-16. 

28. An antibody that specifically binds a polypeptide of claim 27. 

29. The antibody of claim 28, further conjugated to an effector component. 

30. The antibody of claim 29, wherein the effector component is a fluorescent label. 

31. The antibody of claim 29, wherein the effector component is a radioisotope or a cytotoxic chemical. 

32. The antibody of claim 29, which is an antibody fragment. 

33. The antibody of claim 29, which is a humanized antibody. 

34. A method of detecting a prostate cancer cell in a biological sample from a patient, the method comprising contacting the biological sample with an antibody of claim 28. 

35. The method of claim 34, wherein the antibody is further conjugated to an effector component. 

36. The method of claim 35, wherein the effector component is a fluorescent label. 

37. A method of detecting antibodies specific to prostate cancer in a patient, the method comprising contacting a biological sample from the patient with a polypeptide encoded by a nucleic acid that comprises a sequence from Tables 1-16. 

38. A method for identifying a compound that modulates a prostate cancer-associated polypeptide, the method comprising the steps of: 
(i) contacting the compound with a prostate cancer-associated polypeptide, the polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16; and 
(ii) determining the functional effect of the compound upon the polypeptide. 

39. The method of claim 38, wherein the functional effect is a physical effect. 

40. The method of claim 38, wherein the functional effect is a chemical effect. 

41. The method of claim 38, wherein the polypeptide is expressed in a eukaryotic host cell or cell membrane. 

42. The method of claim 38, wherein the functional effect is determined by measuring ligand binding to the polypeptide. 

43. The method of claim 38, wherein the polypeptide is recombinant. 

44. A method of inhibiting proliferation of a prostate cancer-associated cell to treat prostate cancer in a patient, the method comprising the step of administering to the subject a therapeutically effective amount of a compound identified using the method of claim 38. 

45. The method of claim 44, wherein the compound is an antibody. 

46. The method of claim 45, wherein the patient is a human. 

47. A drug screening assay comprising the steps of: 
(i) administering a test compound to a mammal having prostate cancer or a cell isolated therefrom; 
(ii) comparing the level of gene expression of a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16 in a treated cell or mammal with the level of gene expression of the polynucleotide in a control cell or mammal, wherein a test compound that modulates the level of expression of the polynucleotide is a candidate for the treatment of prostate cancer. 

48. The assay of claim 47, wherein the control is a mammal with prostate cancer or a cell therefrom that has not been treated with the test compound. 

49. The assay of claim 47, wherein the control is a normal cell or mammal. 

50. A method for treating a mammal having prostate cancer comprising administering a compound identified by the assay of claim 47. 

51. A pharmaceutical composition for treating a mammal having prostate cancer, the composition comprising a compound identified by the assay of claim 47 and a physiologically acceptable excipient. 

52. The method according to claim 1, wherein said biological sample is contacted with a plurality of polynucleotides comprising a first polynucleotide that selectively hybridizes to a sequence at least 80% identical to a first sequence as shown in Tables 1-16; and a second polynucleotide that selectively hybridizes to a second sequence at least 80% identical to a second sequence as shown in Tables 1-16. 

53. A method according to claim 52, wherein the plurality of polynucleotides comprises a third polynucleotide that selectively hybridizes to a sequence at least 80% identical to a third sequence as shown in Tables 1-16. 

54. A method of detecting a prostate cancer associated transcript, the method comprising contacting a biological sample from the patient with a plurality of polynucleotides wherein at least two of said polynucleotides selectively hybridize to a different sequence at least 80% identical to a sequence as shown in Tables 1-16. 

55. A method of detecting a prostate cancer, the method comprising the steps of: 
(i) providing a biological sample from a patient; 
(ii) contacting the biological sample with a first polynucleotide that selectively hybridizes to a sequence at least 80% identical to a first sequence as shown in Tables 1-16 to determine the level of a prostate cancer-associated transcript in the biological sample; and with a second polynucleotide that selectively hybridizes to a second sequence at least 80% identical to a sequence not shown in Tables 1-16; wherein the expression of said second sequence is not substantially changed in prostate cancer, to determine the level of expression of a control transcript in the biological sample; 
(iii) comparing the level of the prostate cancer-associated transcript to a level of the normal tissue associated transcript in the biological sample. 

56. A method of quantitating a prostate cancer-associated transcript in a cell from a patient, the method comprising contacting a biological sample from the patient with a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16. 

57. The method of claim 56, wherein the polynucleotide selectively hybridizes to a sequence at least 95% identical to a sequence as shown in Tables 1-16. 

58. The method of claim 56, wherein the biological sample is a tissue sample. 

59. The method of claim 56, wherein the biological sample comprises isolated nucleic acids. 

60. The method of claim 56, wherein the nucleic acids are mRNA. 

61. The method of claim 59, further comprising the step of amplifying nucleic acids before the step of contacting the biological sample with the polynucleotide. 

62. The method of claim 56, wherein the polynucleotide comprises a sequence as shown in Tables 1-16. 

63. The method of claim 56, wherein the polynucleotide is labeled. 

64. The method of claim 63, wherein the label is a fluorescent label. 

65. The method of claim 56, wherein the polynucleotide is immobilized on a solid surface. 

66. The method of claim 56, wherein the patient is undergoing a therapeutic regimen to treat metastatic prostate cancer. 

67. The method of claim 56, wherein the patient is suspected of having metastatic prostate cancer. 

68. A biochip comprising a plurality of polynucleotides that selectively hybridize to a sequence at least 80% identical to a sequence as shown in Tables 1-16. 

69. A method of screening drug candidates comprising: 
i) providing a cell that expresses an expression profile gene selected from the group consisting of an expression profile gene set forth in Tables 1-16 or fragment thereof; 
ii) adding a drug candidate to said cell; and 
iii) determining the effect of said drug candidate on the expression of said expression profile gene. 

70. A method according to claim 59 wherein said determining comprises comparing the level of expression in the absence of said drug candidate to the level of expression in the presence of said drug candidate. 

1 SF 1277890 vl 